Sunday, November 9, 2014

SMTP email conversation threads with python: making gmail associate messages into a thread

I have some Python software that sends emails, and I wanted Gmail to group messages related to the same subject into one conversation. It's not immediately obvious how this works, and there's plenty of bad advice out there, including people stating that you just need to add "RE:" to the subject line, which is just wrong.

Conversation threads are constructed by putting the "Message-ID" header of the original email into the "In-Reply-To" and "References" headers of the follow-up. These are message headers rather than part of the SMTP protocol itself. RFC 2822 (since superseded by RFC 5322) has the details, explains some fairly complex multi-parent cases, and includes some good examples. My use case was very simple, just two messages I wanted associated. The first message is sent with a message-id, the second one references it (you'll need to store this ID somewhere if you want to send more messages in the same thread):
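import email.utils
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import smtplib

# First message: generate the Message-ID ourselves so we can refer to it later.
myid = email.utils.make_msgid()
msg = MIMEMultipart("alternative")
msg["Subject"] = "test"
msg["From"] = "myuser@mycompany.com"
msg["To"] = "myuser@mycompany.com"
msg.add_header("Message-ID", myid)
msg.attach(MIMEText("first message", "plain"))
s = smtplib.SMTP("smtp.mycompany.com")
s.sendmail("myuser@mycompany.com", ["myuser@mycompany.com"], msg.as_string())

# Second message: it gets a new Message-ID of its own and points back
# at the first message through In-Reply-To and References.
msg = MIMEMultipart("alternative")
msg["Subject"] = "test"
msg["From"] = "myuser@mycompany.com"
msg["To"] = "myuser@mycompany.com"
msg.add_header("Message-ID", email.utils.make_msgid())
msg.add_header("In-Reply-To", myid)
msg.add_header("References", myid)
msg.attach(MIMEText("second message", "plain"))
# Reuse the open connection, then close it.
s.sendmail("myuser@mycompany.com", ["myuser@mycompany.com"], msg.as_string())
s.quit()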
Note that it is up to the host generating the message ID to guarantee it is unique. email.utils.make_msgid() gives you an RFC-compliant message ID built from a timestamp, the process ID, and a random number, followed by the local hostname. If you are sending lots of mail that may not be enough, so you can pass it extra data that will get appended to the ID:
In [3]: import email.utils

In [4]: email.utils.make_msgid()
Out[4]: '<20141110055935.21441.10732@myhost.mycompany.com>'

In [5]: email.utils.make_msgid('extrarandomsauce')
Out[5]: '<20141110060140.21441.5878.extrarandomsauce@myhost.mycompany.com>'
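If you also want the domain part under your control, make_msgid takes the extra data as the idstring keyword and, on Python 3.2 and later, a domain keyword. Here is a minimal sketch; the new_message_id helper, the "ticket-1234" tag and the mycompany.com domain are made-up names for illustration, not anything Gmail requires:

```python
import email.utils


def new_message_id(tag, domain="mycompany.com"):
    """Return a fresh Message-ID with `tag` appended to the local part.

    `tag` and `domain` are illustrative values chosen for this sketch.
    The `domain` keyword needs Python 3.2 or later.
    """
    return email.utils.make_msgid(idstring=tag, domain=domain)


# Generate a batch of IDs and check that none of them collide.
ids = [new_message_id("ticket-1234") for _ in range(1000)]
print(ids[0])
print(len(set(ids)) == len(ids))
```

The tag makes it easy to see which part of your system generated an ID when you read raw headers later, but it is the timestamp, process ID and random number that keep the IDs distinct.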
I wasn't particularly careful with my first test message ID and sent a non-compliant one (a bare string with no angle brackets and no @domain part) :) Gmail recognizes this and politely fixes it for you:
Message-ID: <545c12bc.240ada0a.6dba.5421smtpin_added_broken@gmr-mx.google.com>
X-Google-Original-Message-ID: testafasdfasdfasdfasdf
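For threads longer than two messages, the RFC's rule is that a reply's References header is the parent's References followed by the parent's Message-ID. Below is a rough sketch of that, which also refuses a malformed parent ID like the one above rather than letting the mail server rewrite it. The build_reply helper and the MSGID_RE pattern are my own names for this example, and the pattern is only a loose shape check, not the full RFC grammar:

```python
import email.utils
import re
from email.mime.text import MIMEText

# Loose shape check only: "<something@something>" with no whitespace.
# This is an approximation for the sketch, not the full RFC grammar.
MSGID_RE = re.compile(r"^<[^<>@\s]+@[^<>@\s]+>$")


def build_reply(parent_id, parent_references, subject, body, sender, recipient):
    """Build (but do not send) a message that replies to `parent_id`.

    `parent_references` is the parent's own References header value,
    or "" if the parent started the thread.
    """
    if not MSGID_RE.match(parent_id):
        raise ValueError("not a well-formed Message-ID: %r" % parent_id)
    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # Every message gets its own new Message-ID.
    msg["Message-ID"] = email.utils.make_msgid()
    msg["In-Reply-To"] = parent_id
    # References = the parent's References, then the parent's Message-ID.
    msg["References"] = (parent_references + " " + parent_id).strip()
    return msg


me = "myuser@mycompany.com"
first_id = email.utils.make_msgid()
second = build_reply(first_id, "", "test", "second message", me, me)
third = build_reply(second["Message-ID"], second["References"],
                    "test", "third message", me, me)
print(third["In-Reply-To"])
print(third["References"])
```

To continue a thread from another script or a later run, the Message-ID and References values of the most recent message are what you would need to store.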

2 comments:

Anonymous said...

One thing to note is that we have to set the header with msg.add_header("Message-ID",myid) again in each separate script that posts mail, for example if you are using a DB to store the myid and fetching it in a ticket management system. I spent a long time working this out, hope you get the point.

G said...

Thanks, noted that in the text.