
Using Python to Read and Save Your Outlook Emails!

Utilizing PyAutoGUI and win32com to read Outlook emails and save them to your system.

Use Case:

Downloading emails from Outlook and storing them so that you can trigger other processes or securely back up emails for auditing purposes.


Getting started:

  • Python3
  • Pip3
  • pip3 install pyautogui
  • pip3 install keyboard
  • pip3 install pywin32
  • Periods of time when you are not actively using your computer
  • Windows Operating System

Setup:

Install Python along with PyAutoGUI and pywin32. This specific use case and program can only be used on Windows, as we are accessing Outlook via the Windows Component Object Model (COM). This article assumes you’ve read my previous article here and would like to expand upon that idea (or you were curious what emailReader.py was).


Part 1. PyAutoGUI:

The first portion of this is getting your clicking set up. This is extremely important, as Windows’ COM does not easily let you interact with it from a scheduled task. Using baseBot.py found here, set up a series of clicks to:

  1. Open a terminal
  2. Go to the program’s directory
  3. Run a python program
  4. Wait for a set amount of time (I usually do 30 seconds to 1 minute depending on the program and its average time to run.)
  5. Close the program
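
The five steps above can be sketched as a small list of actions that is replayed in order. This is only an illustration: the coordinates, the directory name, and the wait times are placeholders, and `run_steps` is a hypothetical helper that is not part of baseBot.py. The click, type, and sleep functions are passed in, so the sequence can be dry-run without touching the mouse or keyboard.

```python
import time

def run_steps(steps, click, type_line, sleep=time.sleep):
    """Replay (action, value) steps using the click/type/sleep functions passed in."""
    for action, value in steps:
        if action == "click":
            click(*value)        # value is an (x, y) tuple
        elif action == "type":
            type_line(value)     # value is the text to type, followed by Enter
        elif action == "wait":
            sleep(value)         # value is a number of seconds
        else:
            raise ValueError("Unknown action: {}".format(action))

# Placeholder coordinates, directory, and timings; replace them with your own.
STEPS = [
    ("click", (1496, 1434)),            # 1. Open a terminal
    ("wait", 3),                        #    Give the terminal time to load
    ("type", "cd testLocation"),        # 2. Go to the program's directory
    ("type", "python emailReader.py"),  # 3. Run a python program
    ("wait", 60),                       # 4. Wait for it to finish
    ("type", "exit"),                   # 5. Close the program
]

# Dry run: print each step instead of moving the mouse or typing.
run_steps(
    STEPS,
    click=lambda x, y: print("click", x, y),
    type_line=lambda text: print("type", text),
    sleep=lambda seconds: print("wait", seconds),
)
```

For a real run you would pass `pyautogui.click` as `click` and a function that types the text and presses Enter as `type_line`.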

To save time and reuse perfectly good code, I will use the code at the end of the PyAutoGUI article as a starting point.

import pyautogui  
import logging  
import keyboard  
import time  
import argparse  
import sys  
  
logging.basicConfig(level=logging.INFO)  
  
# Set up logging  
def get_arg():  
    """ Takes nothing  
Purpose: Gets arguments from command line  
Returns: Argument's values  
"""  
    parser = argparse.ArgumentParser()  
    # Information  
    parser.add_argument("-d","--debug",dest="debug",action="store_true",help="Turn on debugging",default=False)  
    # Functionality  
    parser.add_argument("-f","--find",dest="find",action="store_true",help="Turn on finder mode to see coordinates for mouse and colors",default=False)  
  
    options = parser.parse_args()
    if options.debug:
        # basicConfig() does nothing once logging is already configured, so set the level on the root logger directly
        logging.getLogger().setLevel(logging.DEBUG)
        global DEBUG
        DEBUG = True
    return options
  
def finder():  
    """ Takes nothing  
Purpose: Finds the mouse position and color  
Returns: Nothing  
"""  
    while keyboard.is_pressed('q') != True:  
        if keyboard.is_pressed('c') == True:  
            x, y = pyautogui.position()  
            r,g,b = pyautogui.pixel(x, y)  
  
            logging.info("Mouse position: {}, {}. R: {}. G: {}. B: {}.".format(x, y, r, g, b))  
            logging.info("\twin32api.SetCursorPos(({}, {}))".format(x, y))  
            logging.info("\tpyautogui.pixel({}, {})[0] == {} and pyautogui.pixel({}, {})[1] == {} and pyautogui.pixel({}, {})[2] == {}\n".format(x, y, r, x, y, g, x, y, b))  
            time.sleep(1)  
  
  
def typeWriter(text):  
    """ Takes text  
Purpose: Types out the text  
Returns: Nothing  
"""  
    if text == "ENTER":  
        pyautogui.press('enter')  
    else:  
        pyautogui.typewrite(text)  
        pyautogui.press('enter')  
  
  
def clicker(x,y):  
    """ Takes x and y coordinates  
Purpose: Clicks the location  
Returns: Nothing  
"""  
    pyautogui.click(x,y)  
  
  
def main():  
    options = get_arg()  
    logging.info("Starting program")  
    if options.find:
        finder()
        sys.exit(0) # Finder mode finished normally

    # Read the pixel once and check that every color channel is in the expected range
    r, g, b = pyautogui.pixel(1496, 1434)
    if all(value in range(40, 60) for value in (r, g, b)):
        clicker(1496, 1434) # Clicks the location
        time.sleep(3) # Wait for the program to load
        # typeWriter() already presses Enter after typing, so no extra Enter is needed
        typeWriter("cd testLocation") # Change to a different location
        typeWriter("python emailReader.py") # Run emailReader.py program
        time.sleep(60) # Wait 60 seconds
        typeWriter("exit") # Close terminal
    else:
        logging.fatal("Color is not in range!") # Let user know that the color isn't in range
  
if __name__ == "__main__":  
    main()

The above code is going to be the program that we call via a scheduled task. While I am going to keep the two programs separate so that things stay organized, there is nothing wrong with adding an if statement to the above code and having another argument call the win32com functionality we are about to code.
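
If you would rather keep everything in one file, the extra argument could look something like the sketch below. The `--read` flag and the `choose_mode` helper are hypothetical names used for illustration; they are not part of the program above.

```python
import argparse

def build_parser():
    """Build the argument parser with the two existing flags plus a new one."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--debug", dest="debug", action="store_true", help="Turn on debugging")
    parser.add_argument("-f", "--find", dest="find", action="store_true", help="Turn on finder mode")
    # Hypothetical flag: run the win32com email reader instead of the clicker
    parser.add_argument("-r", "--read", dest="read", action="store_true", help="Read Outlook emails")
    return parser

def choose_mode(options):
    """Return the name of the part of the program that should run."""
    if options.find:
        return "finder"
    if options.read:
        return "reader"
    return "clicker"

# Example: parse a fixed argument list instead of the real command line
options = build_parser().parse_args(["--read"])
print("Selected mode: {}".format(choose_mode(options)))
```

In `main()` you would then call `finder()`, the email reading function, or the click sequence depending on the returned mode.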


Part 2. win32com:

This is where the more interesting part happens (and where we actually get to exploit bypassing win32com’s restrictions).

Starting off, I like using this as my boilerplate:

import win32com.client  
import win32com  
import re  
  
EMAILADDRESS = ""  
IGNOREDSENDER = [""]  
  
raw_emails = {}  
  
with open("monitor.txt", "r") as f:  
    lines = f.readlines()  
print(lines)  
  
  
def main():  
    accounts, outlook = init()  
    emails = getEmails(accounts, outlook)  
    print(emails)  
  
if __name__ == "__main__":  
    main()

EMAILADDRESS is the address of the account you want to read. This is helpful if there are multiple accounts on your system but you only want to scrape one of them. IGNOREDSENDER is useful if you have an automated system that sends emails you do not want this program to interact with. monitor.txt holds one subject per line and is useful if you only care about emails with certain subject lines.
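
To make the role of these three settings concrete, here is a minimal sketch of the filtering they drive, written as plain functions so it can be tried without Outlook. The function names, the sample subjects, and the sender names are made up for illustration.

```python
def load_subjects(lines):
    """Return the lowercased, non-empty subject fragments from monitor.txt lines."""
    return [line.strip().lower() for line in lines if line.strip()]

def should_process(subject, sender, subjects, ignored_senders):
    """True if the subject contains a monitored fragment and the sender is not ignored."""
    ignored = {name.strip().lower() for name in ignored_senders if name.strip()}
    if sender.strip().lower() in ignored:
        return False
    return any(fragment in subject.lower() for fragment in subjects)

# Made-up monitor.txt contents and sender names
subjects = load_subjects(["Server Outage\n", "\n", "weekly report\n"])
print(should_process("RE: Server outage in rack 4", "Jane Doe", subjects, ["Alert Bot"]))  # True
print(should_process("Server outage detected", "alert bot", subjects, ["Alert Bot"]))      # False
```

Both the subject and the sender comparisons ignore case here, and blank lines in monitor.txt are dropped so that an empty line does not match every email.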

The next portion is the initialization. It is extremely short for this use case but can get more complex if you work with other COM applications.

def init():
    # Connect to Outlook once and reuse the same application object
    app = win32com.client.Dispatch("Outlook.Application")
    outlook = app.GetNamespace("MAPI")
    accounts = app.Session.Accounts

    return accounts, outlook

The “final” part is the actual meat and potatoes of this program. Since a lot happens in this function and it can get confusing, I have added comments above any line whose purpose is not completely obvious.

def getEmails(accounts, outlook):  
    """Takes accounts and outlook  
Purpose: Gets emails from outlook  
Returns: Dictionary of the emails that were found
"""  
    # Counter used for counting the amount of emails per subject.  
    count = 0  
  
    # Loop through all accounts  
    for account in accounts:  
        # print("Account: {}".format(account))  
  
        # This is used if there is more than 1 account in Outlook. If there is only one, either remove the if statement and reduce the indentation of everything inside it, or add your email to EMAILADDRESS
        if str(account).lower() == EMAILADDRESS.lower():  
            print("Account: {}".format(account))  
            folders = outlook.Folders(account.DeliveryStore.DisplayName)  
            specific_folder = folders.Folders  
  
            # Loop through all folders   
            for folder in specific_folder:      
                #Prints the current folder you are in  
                print("Folder: '{}'".format(folder))  
                # Restricts the program to only check this folder. Useful if you are wanting to only check 1 location versus all of your folders  
                if(folder.name == "Inbox"):  
                    messages = folder.Items  
  
                    # Loop through all messages  
                    for single in messages:  
  
                        # Check if the email subject is located in the monitor.txt file  
                        for subject in lines:  
  
                            # Only look at emails whose subject contains a line from monitor.txt (blank lines are skipped and the comparison ignores case)
                            if subject.strip() and subject.strip().lower() in single.Subject.lower():
                                # Skip emails from the ignored senders. This is useful if an automated system is sending emails about something you are monitoring.
                                try:
                                    if single.SenderName.lower() in [sender.lower() for sender in IGNOREDSENDER]:
                                        continue
                                except AttributeError:
                                    pass
                                # I've found that certain email senders can cause issues with these fields.  
                                try:  
                                    print("Sender: {}".format(single.Sender))  
                                    send = single.Sender  
                                except AttributeError:  
                                    try:  
                                        print("Sender: {}".format(single.SenderName))  
                                        send = single.SenderName  
                                    except AttributeError:  
                                        print("Sender: {}".format(single.SenderEmailAddress))  
                                        send = single.SenderEmailAddress  
  
                                # Prints subject  
                                print("Subject: {}".format(single.Subject))  
                                # Prints when the email was received  
                                print("Received Time: {}".format(single.ReceivedTime))  
                                # Prints if the email is unread or not  
                                print("Unread: {}".format(single.Unread))  
  
                                # This is used to get the body of the email. It will look for the first instance of "From:" or "Confidentiality Notice" and use that as the end of the email body.  
                                loc = re.search("Confidentiality Notice", single.Body)  
                                emailStart = re.search(r"From:\s", single.Body)
  
                                # This is used to show if one of them were found. If they were, it shows the regex match  
                                print("Email Start: {}".format(emailStart))  
                                print("Location: {}".format(loc))  
  
                                # Checks to see which one was found first and uses that as the end of the email body  
                                if emailStart and not loc:  
                                    end = emailStart.start()  
                                elif loc and not emailStart:  
                                    end = loc.start()  
                                elif emailStart and loc:  
                                    end = min(emailStart.start(), loc.start())  
                                else:  
                                    end = None  
  
                                # Captures the body of the email until the established end is found  
                                if end:  
                                    body = single.Body[:end]  
                                    body = body.replace("\r", "")  
                                    body = body.replace("\n\n", "\n")  
                                    body = body.strip()  
                                    print("Body:\n{}".format(body))  
                                  
                                # Captures the entire email. If this is the first email in a chain, then this will trigger.  
                                else:  
                                    body = single.Body  
                                    body = body.replace("\r", "")  
                                    body = body.replace("\n\n", "\n")  
                                    body = body.strip()  
                                    print("Body:\n{}".format(body))  
                                  
                                # This regex is used for tracking the amount of emails pertaining to a specific subject.
                                regex = re.compile(r"(Regex)")
                                name = re.findall(regex, str(single.Subject))
                                if name:
                                    name = name[0].strip()
                                else:
                                    # Fall back to the full subject so the dictionary key is always a string
                                    name = str(single.Subject).strip()
                                print("Name: {}".format(name))
                                # If the name is already used, append _0, _1, ... until it is unique
                                if name in raw_emails:
                                    count = 0
                                    while "{}_{}".format(name, count) in raw_emails:
                                        count += 1
                                    name = "{}_{}".format(name, count)
                                    print("Stored as: {}".format(name))
  
                                # Adds it to a dictionary so you can modify the data later or save it  
                                raw_emails[name] = {"body": body.strip(), "subject": str(single.Subject).strip(), "sender": str(send).strip(), "received": str(single.ReceivedTime), "unread": single.Unread}  
                                # Separate 1 email from another
                                print("-"*250+"\n\n")  
    # Prints all of the content  
    print(raw_emails)  
  
    # Converts the dictionary to a JSON string. json.dumps takes care of the quoting and escaping so that other programs can read the file properly.
    import json  # Imported here so this step works on its own; normally this belongs at the top of the file
    tmpEmails = json.dumps(raw_emails, indent=4)

    # Uncomment if you want it saved as a json file. You can also make this as a flag from an argument
    # with open("emails.json", "w") as f:
    #     f.write(tmpEmails)
  
    # Saves the emails to a text file.  
    with open("emails.txt", "w") as f:  
        for key, value in raw_emails.items():  
            f.write("ID: {}\n".format(key))  
            f.write("Subject: {}\n".format(value["subject"]))  
            f.write("Sender: {}\n".format(value["sender"]))  
            f.write("Received: {}\n".format(value["received"]))
            f.write("Unread: {}\n".format(value["unread"]))  
            try:  
                f.write("Body:\n{}\n".format(value["body"]))  
            except UnicodeEncodeError as e:  
                f.write("Body:\n{}\n".format("{}".format(str(value["body"].encode("utf-8")))))  
            f.write("-"*250+"\n\n")  
  
    print("Finished Successfully")
    return raw_emails
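
The body trimming step is the easiest part to get wrong, so it can help to test it on its own. Below is a minimal sketch of the same idea as a standalone function; `trim_body` is a hypothetical helper name, and the sample text is made up. One small difference from the function above: a marker at the very start of the body yields an empty string here instead of the full body.

```python
import re

def trim_body(body, markers=(r"From:\s", r"Confidentiality Notice")):
    """Return the body up to the earliest marker, with carriage returns and doubled newlines removed."""
    positions = []
    for pattern in markers:
        match = re.search(pattern, body)
        if match:
            positions.append(match.start())
    if positions:
        # Cut at whichever marker appears first
        body = body[:min(positions)]
    body = body.replace("\r", "")
    body = body.replace("\n\n", "\n")
    return body.strip()

# Made-up reply with a quoted message underneath it
sample = "Thanks, the fix works.\r\n\r\nFrom: Jane Doe\r\nSent: Monday\r\nOriginal message"
print(trim_body(sample))  # Thanks, the fix works.
```

Inside getEmails, the if/else around `end` could then be replaced with a single call such as `body = trim_body(single.Body)`.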

The final program will look like this:

import win32com.client  
import win32com  
import re  
  
EMAILADDRESS = ""  
IGNOREDSENDER = [""]  
  
raw_emails = {}  
  
with open("monitor.txt", "r") as f:  
    lines = f.readlines()  
print(lines)  
  
  
def init():
    # Connect to Outlook once and reuse the same application object
    app = win32com.client.Dispatch("Outlook.Application")
    outlook = app.GetNamespace("MAPI")
    accounts = app.Session.Accounts

    return accounts, outlook
  
  
def getEmails(accounts, outlook):  
    """Takes accounts and outlook  
Purpose: Gets emails from outlook  
Returns: Dictionary of the emails that were found
"""  
    # Counter used for counting the amount of emails per subject.  
    count = 0  
  
    # Loop through all accounts  
    for account in accounts:  
        # print("Account: {}".format(account))  
  
        # This is used if there is more than 1 account in Outlook. If there is only one, either remove the if statement and reduce the indentation of everything inside it, or add your email to EMAILADDRESS
        if str(account).lower() == EMAILADDRESS.lower():  
            print("Account: {}".format(account))  
            folders = outlook.Folders(account.DeliveryStore.DisplayName)  
            specific_folder = folders.Folders  
  
            # Loop through all folders   
            for folder in specific_folder:      
                #Prints the current folder you are in  
                print("Folder: '{}'".format(folder))  
                # Restricts the program to only check this folder. Useful if you are wanting to only check 1 location versus all of your folders  
                if(folder.name == "Inbox"):  
                    messages = folder.Items  
  
                    # Loop through all messages  
                    for single in messages:  
  
                        # Check if the email subject is located in the monitor.txt file  
                        for subject in lines:  
  
                            # Only look at emails whose subject contains a line from monitor.txt (blank lines are skipped and the comparison ignores case)
                            if subject.strip() and subject.strip().lower() in single.Subject.lower():
                                # Skip emails from the ignored senders. This is useful if an automated system is sending emails about something you are monitoring.
                                try:
                                    if single.SenderName.lower() in [sender.lower() for sender in IGNOREDSENDER]:
                                        continue
                                except AttributeError:
                                    pass
                                # I've found that certain email senders can cause issues with these fields.  
                                try:  
                                    print("Sender: {}".format(single.Sender))  
                                    send = single.Sender  
                                except AttributeError:  
                                    try:  
                                        print("Sender: {}".format(single.SenderName))  
                                        send = single.SenderName  
                                    except AttributeError:  
                                        print("Sender: {}".format(single.SenderEmailAddress))  
                                        send = single.SenderEmailAddress  
  
                                # Prints subject  
                                print("Subject: {}".format(single.Subject))  
                                # Prints when the email was received  
                                print("Received Time: {}".format(single.ReceivedTime))  
                                # Prints if the email is unread or not  
                                print("Unread: {}".format(single.Unread))  
  
                                # This is used to get the body of the email. It will look for the first instance of "From:" or "Confidentiality Notice" and use that as the end of the email body.  
                                loc = re.search("Confidentiality Notice", single.Body)  
                                emailStart = re.search(r"From:\s", single.Body)
  
                                # This is used to show if one of them were found. If they were, it shows the regex match  
                                print("Email Start: {}".format(emailStart))  
                                print("Location: {}".format(loc))  
  
                                # Checks to see which one was found first and uses that as the end of the email body  
                                if emailStart and not loc:  
                                    end = emailStart.start()  
                                elif loc and not emailStart:  
                                    end = loc.start()  
                                elif emailStart and loc:  
                                    end = min(emailStart.start(), loc.start())  
                                else:  
                                    end = None  
  
                                # Captures the body of the email until the established end is found  
                                if end:  
                                    body = single.Body[:end]  
                                    body = body.replace("\r", "")  
                                    body = body.replace("\n\n", "\n")  
                                    body = body.strip()  
                                    print("Body:\n{}".format(body))  
                                  
                                # Captures the entire email. If this is the first email in a chain, then this will trigger.  
                                else:  
                                    body = single.Body  
                                    body = body.replace("\r", "")  
                                    body = body.replace("\n\n", "\n")  
                                    body = body.strip()  
                                    print("Body:\n{}".format(body))  
                                  
                                # This regex is used for tracking the amount of emails pertaining to a specific subject.
                                regex = re.compile(r"(Regex)")
                                name = re.findall(regex, str(single.Subject))
                                if name:
                                    name = name[0].strip()
                                else:
                                    # Fall back to the full subject so the dictionary key is always a string
                                    name = str(single.Subject).strip()
                                print("Name: {}".format(name))
                                # If the name is already used, append _0, _1, ... until it is unique
                                if name in raw_emails:
                                    count = 0
                                    while "{}_{}".format(name, count) in raw_emails:
                                        count += 1
                                    name = "{}_{}".format(name, count)
                                    print("Stored as: {}".format(name))
  
                                # Adds it to a dictionary so you can modify the data later or save it  
                                raw_emails[name] = {"body": body.strip(), "subject": str(single.Subject).strip(), "sender": str(send).strip(), "received": str(single.ReceivedTime), "unread": single.Unread}  
                                # Separate 1 email from another
                                print("-"*250+"\n\n")  
    # Prints all of the content  
    print(raw_emails)  
  
    # Converts the dictionary to a JSON string. json.dumps takes care of the quoting and escaping so that other programs can read the file properly.
    import json  # Imported here so this step works on its own; normally this belongs at the top of the file
    tmpEmails = json.dumps(raw_emails, indent=4)

    # Uncomment if you want it saved as a json file. You can also make this as a flag from an argument
    # with open("emails.json", "w") as f:
    #     f.write(tmpEmails)
  
    # Saves the emails to a text file.  
    with open("emails.txt", "w") as f:  
        for key, value in raw_emails.items():  
            f.write("ID: {}\n".format(key))  
            f.write("Subject: {}\n".format(value["subject"]))  
            f.write("Sender: {}\n".format(value["sender"]))  
            f.write("Received: {}\n".format(value["received"]))
            f.write("Unread: {}\n".format(value["unread"]))  
            try:  
                f.write("Body:\n{}\n".format(value["body"]))  
            except UnicodeEncodeError as e:  
                f.write("Body:\n{}\n".format("{}".format(str(value["body"].encode("utf-8")))))  
            f.write("-"*250+"\n\n")  
  
    print("Finished Successfully")
    return raw_emails  
  
  
def main():  
    accounts, outlook = init()  
    emails = getEmails(accounts, outlook)  
    print(emails)  
  
if __name__ == "__main__":  
    main()
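
Since the point of saving the emails is to trigger other processes, here is a hedged sketch of how the returned dictionary could be written as JSON and read back by a second program. The helper names and the file path are assumptions; the dictionary layout matches the one built in getEmails.

```python
import json

def save_emails_json(emails, path):
    """Write the emails dictionary to path as UTF-8 encoded JSON."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-Latin characters readable in the file
        json.dump(emails, f, indent=4, ensure_ascii=False)

def load_unread(path):
    """Return only the unread emails from a file written by save_emails_json."""
    with open(path, "r", encoding="utf-8") as f:
        emails = json.load(f)
    return {key: value for key, value in emails.items() if value["unread"]}
```

A call such as `save_emails_json(raw_emails, "emails.json")` at the end of getEmails would produce the file, and the downstream process would call `load_unread("emails.json")`.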

Limitations:

While I am personally extremely happy with this functionality, it is not without its flaws. As mentioned in the previous PyAutoGUI article, this program requires you to not use your system while it runs. This causes massive scaling issues…

Once a day before work starts? That is fine.

Another time while you are out at lunch? Also fine.

Checking every 30 minutes or every hour while working? That is an issue…

If you need real-time updates, you will need another system that you are not actively using. A dedicated Windows server or a Windows virtual machine can fill that role.

Another interesting limitation is saving emails that contain characters outside basic Latin. This sidetracked my original program for roughly 2 hours while I was trying to sanitize a kanji email signature…

In the end, I opted to have the entire email body encoded to UTF-8 whenever it cannot be written as-is. In theory, you could work out where the non-Latin characters start and end, encode just those characters, and save the rest of the email in its native format.
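
An alternative worth trying, assuming the UnicodeEncodeError comes from the default Windows code page that open() uses when no encoding is given, is to open the output file as UTF-8. The sketch below is not what the program above does; `save_emails_text` is a hypothetical helper and the sample email is made up.

```python
def save_emails_text(emails, path):
    """Write each email to a UTF-8 text file so non-Latin characters are stored as-is."""
    with open(path, "w", encoding="utf-8") as f:
        for key, value in emails.items():
            f.write("ID: {}\n".format(key))
            f.write("Subject: {}\n".format(value["subject"]))
            f.write("Body:\n{}\n".format(value["body"]))
            f.write("-" * 40 + "\n")

# Made-up email with a kanji signature
sample = {"Ticket-1": {"subject": "Ticket-1 update", "body": "Thanks!\n山田太郎"}}
```

Calling `save_emails_text(sample, "emails.txt")` writes the signature unchanged, and any program that reads the file back needs to open it with `encoding="utf-8"` as well.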


Why does this matter and why would I need this?

If you have gotten this far, I have to commend you! When I originally talked to my team and family about this idea, I was instantly questioned about it, since reading an email isn’t that hard. I always had to explain the potential use cases for something like this.

Do you want to upload every email to Jira so that you can have the information in a ticket for other analysts/testers/managers to see the entire chain?

Do you want to parse every email into a database so that you have a more in-depth knowledge base for a chatbot to respond with?

Do you want to send an email through a Jira mail server, or would you prefer to send an email from a bot as if it were yourself?

These reasons (along with a few more client specific reasons) are why I spent way too much time trying to figure out how to do everything listed in the two programs above. Below I have included the code in their final forms.


If you get this far, thank you so much for taking the time to read this article on “Using Python to read and save your Outlook emails!”

Until next time, Stay curious and Hack the Planet!


Code:

winBypass.py

import pyautogui  
import logging  
import keyboard  
import time  
import argparse  
import sys  
  
logging.basicConfig(level=logging.INFO)  
  
# Set up logging  
def get_arg():  
    """ Takes nothing  
Purpose: Gets arguments from command line  
Returns: Argument's values  
"""  
    parser = argparse.ArgumentParser()  
    # Information  
    parser.add_argument("-d","--debug",dest="debug",action="store_true",help="Turn on debugging",default=False)  
    # Functionality  
    parser.add_argument("-f","--find",dest="find",action="store_true",help="Turn on finder mode to see coordinates for mouse and colors",default=False)  
  
    options = parser.parse_args()
    if options.debug:
        # basicConfig() does nothing once logging is already configured, so set the level on the root logger directly
        logging.getLogger().setLevel(logging.DEBUG)
        global DEBUG
        DEBUG = True
    return options
  
def finder():  
    """ Takes nothing  
Purpose: Finds the mouse position and color  
Returns: Nothing  
"""  
    while keyboard.is_pressed('q') != True:  
        if keyboard.is_pressed('c') == True:  
            x, y = pyautogui.position()  
            r,g,b = pyautogui.pixel(x, y)  
  
            logging.info("Mouse position: {}, {}. R: {}. G: {}. B: {}.".format(x, y, r, g, b))  
            logging.info("\twin32api.SetCursorPos(({}, {}))".format(x, y))  
            logging.info("\tpyautogui.pixel({}, {})[0] == {} and pyautogui.pixel({}, {})[1] == {} and pyautogui.pixel({}, {})[2] == {}\n".format(x, y, r, x, y, g, x, y, b))  
            time.sleep(1)  
  
  
def typeWriter(text):  
    """ Takes text  
Purpose: Types out the text  
Returns: Nothing  
"""  
    if text == "ENTER":  
        pyautogui.press('enter')  
    else:  
        pyautogui.typewrite(text)  
        pyautogui.press('enter')  
  
  
def clicker(x,y):  
    """ Takes x and y coordinates  
Purpose: Clicks the location  
Returns: Nothing  
"""  
    pyautogui.click(x,y)  
  
  
def main():  
    options = get_arg()  
    logging.info("Starting program")  
    if options.find:
        finder()
        sys.exit(0) # Finder mode finished normally

    # Read the pixel once and check that every color channel is in the expected range
    r, g, b = pyautogui.pixel(1496, 1434)
    if all(value in range(40, 60) for value in (r, g, b)):
        clicker(1496, 1434) # Clicks the location
        time.sleep(3) # Wait for the program to load
        # typeWriter() already presses Enter after typing, so no extra Enter is needed
        typeWriter("cd testLocation") # Change to a different location
        typeWriter("python emailReader.py") # Run emailReader.py program
        time.sleep(60) # Wait 60 seconds
        typeWriter("exit") # Close terminal
    else:
        logging.fatal("Color is not in range!") # Let user know that the color isn't in range
  
if __name__ == "__main__":  
    main()

emailReader.py

import win32com.client  
import win32com  
import re  
  
EMAILADDRESS = ""  
IGNOREDSENDER = [""]  
  
raw_emails = {}  
  
with open("monitor.txt", "r") as f:  
    lines = f.readlines()  
print(lines)  
  
  
def init():
    # Connect to Outlook once and reuse the same application object
    app = win32com.client.Dispatch("Outlook.Application")
    outlook = app.GetNamespace("MAPI")
    accounts = app.Session.Accounts

    return accounts, outlook
  
  
def getEmails(accounts, outlook):  
    """Takes accounts and outlook  
Purpose: Gets emails from outlook  
Returns: Dictionary of the emails that were found
"""  
    # Counter used for counting the amount of emails per subject.  
    count = 0  
  
    # Loop through all accounts  
    for account in accounts:  
        # print("Account: {}".format(account))  
  
        # This is used if there is more than 1 account in Outlook. If there is only one, either remove the if statement and reduce the indentation of everything inside it, or add your email to EMAILADDRESS
        if str(account).lower() == EMAILADDRESS.lower():  
            print("Account: {}".format(account))  
            folders = outlook.Folders(account.DeliveryStore.DisplayName)  
            specific_folder = folders.Folders  
  
            # Loop through all folders   
            for folder in specific_folder:      
                #Prints the current folder you are in  
                print("Folder: '{}'".format(folder))  
                # Restricts the program to only check this folder. Useful if you are wanting to only check 1 location versus all of your folders  
                if(folder.name == "Inbox"):  
                    messages = folder.Items  
  
                    # Loop through all messages  
                    for single in messages:  
  
                        # Check if the email subject is located in the monitor.txt file  
                        for subject in lines:  
  
                            # Only look at emails whose subject contains a line from monitor.txt (blank lines are skipped and the comparison ignores case)
                            if subject.strip() and subject.strip().lower() in single.Subject.lower():
                                # Skip emails from the ignored senders. This is useful if an automated system is sending emails about something you are monitoring.
                                try:
                                    if single.SenderName.lower() in [sender.lower() for sender in IGNOREDSENDER]:
                                        continue
                                except AttributeError:
                                    pass
                                # I've found that certain email senders can cause issues with these fields.  
                                try:  
                                    print("Sender: {}".format(single.Sender))  
                                    send = single.Sender  
                                except AttributeError:  
                                    try:  
                                        print("Sender: {}".format(single.SenderName))  
                                        send = single.SenderName  
                                    except AttributeError:  
                                        print("Sender: {}".format(single.SenderEmailAddress))  
                                        send = single.SenderEmailAddress  
  
                                # Prints subject  
                                print("Subject: {}".format(single.Subject))  
                                # Prints when the email was received  
                                print("Received Time: {}".format(single.ReceivedTime))  
                                # Prints if the email is unread or not  
                                print("Unread: {}".format(single.Unread))  
  
                                # This is used to find the end of the email body. It looks for the first instance of "From:" or "Confidentiality Notice", which usually marks a quoted reply or a footer.  
                                loc = re.search(r"Confidentiality Notice", single.Body)  
                                emailStart = re.search(r"From:\s", single.Body)  
  
                                # This is used to show if one of them were found. If they were, it shows the regex match  
                                print("Email Start: {}".format(emailStart))  
                                print("Location: {}".format(loc))  
  
                                # Checks to see which one was found first and uses that as the end of the email body  
                                if emailStart and not loc:  
                                    end = emailStart.start()  
                                elif loc and not emailStart:  
                                    end = loc.start()  
                                elif emailStart and loc:  
                                    end = min(emailStart.start(), loc.start())  
                                else:  
                                    end = None  
  
                                # Captures the body of the email up to the established end. If no end was found (for example, the first email in a chain), the entire body is kept.  
                                body = single.Body[:end] if end else single.Body  
                                body = body.replace("\r", "")  
                                body = body.replace("\n\n", "\n")  
                                body = body.strip()  
                                print("Body:\n{}".format(body))  
                                  
                                # This regex is used for tracking the amount of emails pertaining to a specific subject. Replace "(Regex)" with your own pattern.  
                                regex = re.compile(r"(Regex)")  
                                name = re.findall(regex, str(single.Subject))  
                                # Fall back to the subject when the regex finds nothing, because an empty list cannot be used as a dictionary key.  
                                name = name[0].strip() if name else str(single.Subject).strip()  
                                print("Name: {}".format(name))  
                                # Append a numeric suffix until the name is unique so earlier emails are not overwritten.  
                                if name in raw_emails:  
                                    count = 0  
                                    while "{}_{}".format(name, count) in raw_emails:  
                                        count += 1  
                                    name = "{}_{}".format(name, count)  
                                    print("Unique Name: {}".format(name))  
  
                                # Adds it to a dictionary so you can modify the data later or save it  
                                raw_emails[name] = {"body": body.strip(), "subject": str(single.Subject).strip(), "sender": str(send).strip(), "received": str(single.ReceivedTime), "unread": single.Unread}  
                                # Separates one email from another  
                                print("-"*250+"\n\n")  
    # Prints all of the content  
    print(raw_emails)  
  
    # Converts the dictionary to a JSON string. json.dumps handles quoting, escaping, and booleans, which swapping quotes on str(raw_emails) does not, so other programs can read the result.  
    import json  # Imported here so this step stands alone. Move it to the top of the file with the other imports if you prefer.  
    tmpEmails = json.dumps(raw_emails, indent=4)  
  
    # Uncomment if you want it saved as a json file. You can also make this as a flag from an argument  
    # with open("emails.json", "w") as f:  
    #     f.write(tmpEmails)  
  
    # Saves the emails to a text file. UTF-8 encoding lets non-ASCII characters in any field be written without a UnicodeEncodeError.  
    with open("emails.txt", "w", encoding="utf-8") as f:  
        for key, value in raw_emails.items():  
            f.write("ID: {}\n".format(key))  
            f.write("Subject: {}\n".format(value["subject"]))  
            f.write("Sender: {}\n".format(value["sender"]))  
            f.write("Received: {}\n".format(value["received"]))  
            f.write("Unread: {}\n".format(value["unread"]))  
            f.write("Body:\n{}\n".format(value["body"]))  
            f.write("-"*250+"\n\n")  
  
    print("Finished Successfully")  
    return raw_emails  
  
  
def main():  
    accounts, outlook = init()  
    emails = getEmails(accounts, outlook)  
    print(emails)  
  
if __name__ == "__main__":  
    main()
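
If you want to check the text handling without opening Outlook, the pure-Python steps inside getEmails (trimming the body, picking a unique dictionary key, and producing JSON) can be pulled into small functions. The following is only a sketch under assumptions: the names END_MARKERS, trim_body, unique_key, and to_json are hypothetical and do not appear in the script above, and the sample text is made up.

```python
import json
import re

# Hypothetical constant: the two end markers that getEmails searches for.
END_MARKERS = (r"From:\s", r"Confidentiality Notice")


def trim_body(body):
    """Return the text before the earliest end marker, with blank lines collapsed."""
    starts = [m.start() for m in (re.search(p, body) for p in END_MARKERS) if m]
    end = min(starts) if starts else None
    # Same check as the script: no marker (or a marker at position 0) keeps the whole body.
    text = body[:end] if end else body
    return text.replace("\r", "").replace("\n\n", "\n").strip()


def unique_key(name, existing):
    """Return name, or name_0, name_1, ... when name is already a key in existing."""
    if name not in existing:
        return name
    count = 0
    while "{}_{}".format(name, count) in existing:
        count += 1
    return "{}_{}".format(name, count)


def to_json(emails):
    """Serialize the emails dictionary to a JSON string."""
    return json.dumps(emails, indent=4)


if __name__ == "__main__":
    # Made-up message text standing in for single.Body.
    sample = "Hello team,\r\n\r\nThe report is attached.\r\n\r\nFrom: Alice\r\nOlder reply"
    emails = {}
    for _ in range(3):
        key = unique_key("Report", emails)
        emails[key] = {"body": trim_body(sample), "unread": True}
    print(list(emails))
    print(to_json(emails))
```

Run on its own, this stores the same sample three times under the keys Report, Report_0, and Report_1, which is the suffix scheme the script uses when several emails share a name.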

That's it for this topic. Thank you for reading.



