I’m writing a script that connects to a bunch of URLs over HTTPS, downloads their SSL certificate and extracts the CN. Everything works except when I stumble on a site with an invalid SSL certificate. I absolutely do not care if the certificate is valid or not. I just want the CN but Python stubbornly refuses to extract the certificate information if the certificate is not validated. Is there any way to get around this profoundly stupid behavior? Oh, I’m using the built-in socket and ssl libraries only. I don’t want to use third-party libraries like M2Crypto or pyOpenSSL because I’m trying to keep the script as portable as possible.
Here’s the relevant code:
    import socket
    import ssl

    file = open("list.txt", "r")
    for x in file:
        host = x.rstrip()
        sslsocket = socket.socket()
        # connect() takes the hostname directly; getaddrinfo() returns a list of address tuples
        sslsocket.connect((host, 443))
        sslsocket = ssl.wrap_socket(sslsocket, cert_reqs=ssl.CERT_REQUIRED, ca_certs="cacerts.txt")
        certificate = sslsocket.getpeercert()
ssl.get_server_certificate can do it:
    import ssl

    ssl.get_server_certificate(("www.sefaz.ce.gov.br", 443))
I think the function's docstring is clearer than the Python documentation site:
"""Retrieve the certificate from the server at the specified address, and return it as a PEM-encoded string. If 'ca_certs' is specified, validate the server cert against it. If 'ssl_version' is specified, use it in the connection attempt."""
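If you would rather keep control of the socket yourself (timeouts, for instance), roughly the same thing can be done by hand. This is only a sketch for Python 3, not the library's actual implementation: the helper names make_unverified_context and fetch_der_certificate and the 10 second timeout are my own choices. The point is that getpeercert(binary_form=True) hands back the raw DER bytes even when the certificate was not validated, while the dict form comes back empty in that case.

```python
import socket
import ssl


def make_unverified_context():
    """Build a client context that accepts any server certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # check_hostname has to be switched off before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_der_certificate(host, port=443, timeout=10):
    """Return the server certificate as DER bytes without validating it."""
    context = make_unverified_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            # binary_form=True returns the raw certificate even if it was
            # not validated; the dict form would be empty here.
            return tls.getpeercert(binary_form=True)
```

Calling fetch_der_certificate("www.sefaz.ce.gov.br") should give you the DER bytes directly, so the PEM to DER conversion step is not needed.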
So you can extract the common name from the binary DER certificate by searching for the common name object identifier:
    import ssl

    def get_commonname(host, port=443):
        oid = b'\x06\x03U\x04\x03'  # Object Identifier 2.5.4.3 (COMMON NAME)
        pem = ssl.get_server_certificate((host, port))
        der = ssl.PEM_cert_to_DER_cert(pem)
        i = der.find(oid)  # find first common name (certificate authority)
        if i != -1:
            i = der.find(oid, i + 1)  # skip and find second common name
            if i != -1:
                begin = i + len(oid) + 2
                # slice (not index) so ord() works on both Python 2 str and Python 3 bytes
                end = begin + ord(der[begin - 1:begin])
                return der[begin:end]
        return None
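If you want to check the byte search without hitting the network, the parsing part can be split out and run on any DER bytes. A minimal sketch for Python 3, under a few assumptions: find_common_names is a name I made up, it only handles short-form lengths (values under 128 bytes), and it decodes every value as UTF-8 regardless of the actual ASN.1 string type.

```python
CN_OID = b"\x06\x03\x55\x04\x03"  # DER encoding of OID 2.5.4.3 (commonName)


def find_common_names(der):
    """Return every commonName value found in the DER bytes, in order."""
    names = []
    i = der.find(CN_OID)
    while i != -1:
        # After the OID come one tag byte (string type) and one length byte.
        length_pos = i + len(CN_OID) + 1
        if length_pos < len(der):
            length = der[length_pos]
            # Only short-form lengths (< 128) are handled in this sketch.
            if length < 0x80:
                start = length_pos + 1
                value = der[start:start + length]
                names.append(value.decode("utf-8", "replace"))
        i = der.find(CN_OID, i + 1)
    return names
```

The issuer comes before the subject in a certificate, so the second entry is usually the CN you are after. That relies on the same assumption as the function above: the issuer name contains exactly one common name. If the issuer has no CN at all, the first entry will already be the subject's.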