Using Python to Read and Save Your Outlook Emails!

Utilizing PyAutoGUI and win32com to read Outlook emails and save them to your system.

Use Case:

Downloading emails from Outlook and storing them so that you can trigger other processes or securely back up emails for auditing purposes.


Getting started:

  • Python3
  • Pip3
  • pip3 install pyautogui
  • pip3 install keyboard
  • pip3 install pywin32
  • Periods of time where you are not actively using your computer
  • Windows Operating System

Setup:

Install Python along with PyAutoGUI and pywin32. This specific use case and program can only be used on Windows, as we are accessing Outlook via the Windows Component Object Model (COM). This article assumes you’ve read my previous article here and would like to expand upon that idea (or you were curious what emailReader.py was).


Part 1. PyAutoGUI:

The first portion of this is getting your clicking set up. This is extremely important because Windows’ COM does not easily allow you to interact with it via a scheduled task. Using baseBot.py found here, set up a series of clicks to:

  1. Open a terminal
  2. Go to the program’s directory
  3. Run a python program
  4. Wait for a set amount of time (I usually do 30 seconds to 1 minute depending on the program and its average time to run.)
  5. Close the program

To save time and to reuse perfectly good code, I will use the code at the end of the PyAutoGui article as a starting point for this code.

import pyautogui
import logging
import keyboard
import time
import argparse
import sys

# Set up logging
logging.basicConfig(level=logging.INFO)

def get_arg():
    """ Takes nothing
Purpose: Gets arguments from command line
Returns: Argument's values
"""
    parser = argparse.ArgumentParser()
    # Information
    parser.add_argument("-d","--debug",dest="debug",action="store_true",help="Turn on debugging",default=False)
    # Functionality
    parser.add_argument("-f","--find",dest="find",action="store_true",help="Turn on finder mode to see coordinates for mouse and colors",default=False)

    options = parser.parse_args()
    if options.debug:
        # basicConfig() already ran at import time, so calling it again would be ignored; raise the root logger's level instead
        logging.getLogger().setLevel(logging.DEBUG)
        global DEBUG
        DEBUG = True
    return options

def finder():
    """ Takes nothing
Purpose: Finds the mouse position and color
Returns: Nothing
"""
    while keyboard.is_pressed('q') != True:
        if keyboard.is_pressed('c') == True:
            x, y = pyautogui.position()
            r,g,b = pyautogui.pixel(x, y)

            logging.info("Mouse position: {}, {}. R: {}. G: {}. B: {}.".format(x, y, r, g, b))
            logging.info("\twin32api.SetCursorPos(({}, {}))".format(x, y))
            logging.info("\tpyautogui.pixel({}, {})[0] == {} and pyautogui.pixel({}, {})[1] == {} and pyautogui.pixel({}, {})[2] == {}\n".format(x, y, r, x, y, g, x, y, b))
            time.sleep(1)


def typeWriter(text):
    """ Takes text
Purpose: Types out the text
Returns: Nothing
"""
    if text == "ENTER":
        pyautogui.press('enter')
    else:
        pyautogui.typewrite(text)
        pyautogui.press('enter')


def clicker(x,y):
    """ Takes x and y coordinates
Purpose: Clicks the location
Returns: Nothing
"""
    pyautogui.click(x,y)


def main():
    options = get_arg()
    logging.info("Starting program")
    if options.find:
        finder()
        sys.exit(0) # Finder mode ended normally, so exit without an error code

    # Read the pixel once, then check that every color channel is in the expected range
    r, g, b = pyautogui.pixel(1496, 1434)
    if all(channel in range(40, 60) for channel in (r, g, b)):
        clicker(1496, 1434) # Clicks the location
        time.sleep(3) # Wait for the program to load
        typeWriter("cd testLocation") # Change to a different location (typeWriter presses Enter after typing)
        typeWriter("python emailReader.py") # Run emailReader.py program
        time.sleep(60) # Wait 60 seconds
        typeWriter("exit") # Close terminal
    else:
        logging.fatal("Color is not in range!") # Let user know that the color isn't in range

if __name__ == "__main__":
    main()

The above code is going to be the program that we call via a scheduled task. While I am going to separate the code to keep things more organized, there is nothing wrong with adding an if statement to the above code and having another argument call the win32com functionality we are about to code.
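
If you would rather keep everything in one file, the extra argument could look something like the sketch below. The --email flag and the run_bot and run_email_reader functions are hypothetical stand-ins of mine (they are not part of the programs in this article); in a real script they would wrap the PyAutoGUI steps above and the win32com code from Part 2.

```python
import argparse


def run_bot():
    """Hypothetical stand-in for the PyAutoGUI click-and-type steps."""
    return "bot"


def run_email_reader():
    """Hypothetical stand-in for the win32com email reader from Part 2."""
    return "email"


def get_arg(argv=None):
    """Parses the command line; argv=None means argparse reads sys.argv."""
    parser = argparse.ArgumentParser()
    # Assumed flag name; pick whatever fits your script
    parser.add_argument("-e", "--email", dest="email", action="store_true",
                        help="Run the email reader instead of the clicker",
                        default=False)
    return parser.parse_args(argv)


def main(argv=None):
    options = get_arg(argv)
    # A single if statement decides which half of the program runs
    if options.email:
        return run_email_reader()
    return run_bot()
```

Calling main(["--email"]) takes the email branch, while main([]) falls through to the clicker, so the scheduled task only needs to pass a different flag.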


Part 2. win32com:

This is where the more interesting part happens (and where we actually get to exploit bypassing win32com’s restrictions).

Starting off, I like using this as my boilerplate:

import win32com.client
import win32com
import re

EMAILADDRESS = ""
IGNOREDSENDER = [""]

raw_emails = {}

with open("monitor.txt", "r") as f:
    lines = f.readlines()
print(lines)


def main():
    accounts, outlook = init()
    emails = getEmails(accounts, outlook)
    print(emails)

if __name__ == "__main__":
    main()

EMAILADDRESS holds the address of the account you want to read. This is helpful if there are multiple accounts on your system but you only want to scrape one of them. IGNOREDSENDER is useful if you have an automated system that sends emails that you do not want this program to interact with. monitor.txt lists the subject lines you care about, one per line, so only emails whose subject contains one of them are processed.
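
To make the roles of these settings concrete, here is a small sketch of the filtering they drive, kept separate from Outlook so it can be tried on its own. The helper names load_subjects, subject_matches and sender_ignored are mine, not part of emailReader.py, and the case-insensitive comparison is an assumption about how you want matching to behave.

```python
def load_subjects(path):
    """Reads a monitor.txt-style file: one subject fragment per line, blank lines skipped."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip().lower() for line in f if line.strip()]


def subject_matches(subject, subjects):
    """True if any monitored fragment appears in the email subject."""
    subject = subject.lower()
    return any(fragment in subject for fragment in subjects)


def sender_ignored(sender, ignored):
    """True if the sender is on the ignore list (case-insensitive)."""
    return sender.lower() in {name.lower() for name in ignored}
```

Skipping blank lines matters more than it looks: an empty string is contained in every subject, so a stray blank line in monitor.txt would otherwise match every email in the folder.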

The next portion is the initialization. It is extremely short for this use case but could get more complex if you use different COM systems.

def init():
    # Connect to Outlook once and reuse the application object for both handles
    app = win32com.client.Dispatch("Outlook.Application")
    outlook = app.GetNamespace("MAPI")
    accounts = app.Session.Accounts

    return accounts, outlook

The “final” part is the actual meat and potatoes of this program. Since there are many portions that do a lot of things and it can get extremely confusing, I have made a lot of comments that explain what the line does under it if I do not think it is completely obvious.

def getEmails(accounts, outlook):
    """Takes accounts and outlook
Purpose: Gets emails from outlook
Returns: Dictionary of the captured emails
"""
    # Counter used for counting the amount of emails per subject.
    count = 0

    # Loop through all accounts
    for account in accounts:
        # print("Account: {}".format(account))

        # This is used if there are more than 1 account in outlook. If there are not, you will need to either remove the if statement and lower the indention for all statements in the if statement or add your email to EMAILADDRESS
        if str(account).lower() == EMAILADDRESS.lower():
            print("Account: {}".format(account))
            folders = outlook.Folders(account.DeliveryStore.DisplayName)
            specific_folder = folders.Folders

            # Loop through all folders
            for folder in specific_folder:
                #Prints the current folder you are in
                print("Folder: '{}'".format(folder))
                # Restricts the program to only check this folder. Useful if you are wanting to only check 1 location versus all of your folders
                if(folder.name == "Inbox"):
                    messages = folder.Items

                    # Loop through all messages
                    for single in messages:

                        # Check if the email subject is located in the monitor.txt file
                        for subject in lines:

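                            # Only look at emails whose subject contains an entry from the monitor.txt file. Both sides are lowercased and blank lines are skipped, since an empty string would match every email.
                            if subject.strip() and subject.strip().lower() in single.Subject.lower():
                                # Skip emails that are from the ignored senders. This is useful if an automated system is sending emails about something you are monitoring.
                                try:
                                    if single.SenderName.lower() in [sender.lower() for sender in IGNOREDSENDER]:
                                        # This continue belongs to the subject loop, so the email is actually skipped
                                        continue
                                except AttributeError:
                                    pass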
                                # I've found that certain email senders can cause issues with these fields.
                                try:
                                    print("Sender: {}".format(single.Sender))
                                    send = single.Sender
                                except AttributeError:
                                    try:
                                        print("Sender: {}".format(single.SenderName))
                                        send = single.SenderName
                                    except AttributeError:
                                        print("Sender: {}".format(single.SenderEmailAddress))
                                        send = single.SenderEmailAddress

                                # Prints subject
                                print("Subject: {}".format(single.Subject))
                                # Prints when the email was received
                                print("Received Time: {}".format(single.ReceivedTime))
                                # Prints if the email is unread or not
                                print("Unread: {}".format(single.Unread))

                                # This is used to get the body of the email. It will look for the first instance of "From:" or "Confidentiality Notice" and use that as the end of the email body.
                                loc = re.search("Confidentiality Notice", single.Body)
                                emailStart = re.search(r"From:\s", single.Body)

                                # This is used to show if one of them were found. If they were, it shows the regex match
                                print("Email Start: {}".format(emailStart))
                                print("Location: {}".format(loc))

                                # Checks to see which one was found first and uses that as the end of the email body
                                if emailStart and not loc:
                                    end = emailStart.start()
                                elif loc and not emailStart:
                                    end = loc.start()
                                elif emailStart and loc:
                                    end = min(emailStart.start(), loc.start())
                                else:
                                    end = None

                                # Captures the body of the email until the established end is found
                                if end:
                                    body = single.Body[:end]
                                    body = body.replace("\r", "")
                                    body = body.replace("\n\n", "\n")
                                    body = body.strip()
                                    print("Body:\n{}".format(body))

                                # Captures the entire email. If this is the first email in a chain, then this will trigger.
                                else:
                                    body = single.Body
                                    body = body.replace("\r", "")
                                    body = body.replace("\n\n", "\n")
                                    body = body.strip()
                                    print("Body:\n{}".format(body))

                                # This regex is used for tracking the amount of emails pertaining to a specific subject. Replace "(Regex)" with your own pattern.
                                regex = re.compile(r"(Regex)")
                                name = re.findall(regex, str(single.Subject))
                                if name:
                                    name = name[0].strip()
                                else:
                                    # No match: fall back to the subject so the dictionary key is always a string (a list cannot be a key)
                                    name = str(single.Subject).strip()
                                print("Name: {}".format(name))
                                if name in raw_emails:
                                    print("Before Loop:{}".format(name))
                                    count = 0
                                    # Append _0, _1, ... until the key is unused
                                    while name in raw_emails:
                                        testName = name + "_" + str(count)
                                        if testName not in raw_emails:
                                            name = testName
                                            print("After Loop:{}".format(name))
                                            break
                                        count += 1
                                        print("During Loop:{}".format(name))

                                # Adds it to a dictionary so you can modify the data later or save it
                                raw_emails[name] = {"body": body.strip(), "subject": str(single.Subject).strip(), "sender": str(send).strip(), "received": str(single.ReceivedTime), "unread": single.Unread}
                                # Separate 1 email from another
                                print("-"*250+"\n\n")
    # Prints all of the content
    print(raw_emails)

    # Converts the dictionary to a JSON string. json.dumps handles quoting and escaping, which swapping quote characters by hand cannot do reliably (an apostrophe in a body would break the output).
    import json  # Local import so this step stands on its own; it can also live at the top of the file
    tmpEmails = json.dumps(raw_emails, indent=4)

    # Uncomment if you want it saved as a json file. You can also make this as a flag from an argument
    # with open("emails.json", "w") as f:
    #     f.write(tmpEmails)

    # Saves the emails to a text file. The file is written as UTF-8 so non-Latin characters do not break the save.
    with open("emails.txt", "w", encoding="utf-8") as f:
        for key, value in raw_emails.items():
            f.write("ID: {}\n".format(key))
            f.write("Subject: {}\n".format(value["subject"]))
            f.write("Sender: {}\n".format(value["sender"]))
            f.write("Received: {}\n".format(value["received"]))
            f.write("Unread: {}\n".format(value["unread"]))
            try:
                f.write("Body:\n{}\n".format(value["body"]))
            except UnicodeEncodeError as e:
                f.write("Body:\n{}\n".format("{}".format(str(value["body"].encode("utf-8")))))
            f.write("-"*250+"\n\n")

    print("Finished Successfully")
    return raw_emails
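
The body-trimming step is the part of this function most worth trying in isolation, so here it is pulled out as a standalone sketch. trim_body is a hypothetical helper name of mine, and the two markers are the same ones used above; adjust them to whatever your own reply headers and footers look like.

```python
import re

# The same two markers used in getEmails(): the start of a quoted reply and a footer
END_MARKERS = [re.compile(r"From:\s"), re.compile(r"Confidentiality Notice")]


def trim_body(raw_body):
    """Returns the text before the earliest marker, cleaned up the same way as above."""
    matches = [pattern.search(raw_body) for pattern in END_MARKERS]
    starts = [match.start() for match in matches if match]
    # No marker found: keep the whole body (first email in a chain)
    end = min(starts) if starts else len(raw_body)
    body = raw_body[:end].replace("\r", "")
    body = body.replace("\n\n", "\n")
    return body.strip()
```

One difference from the function above: if a marker sits at the very start of the body, this sketch returns an empty string instead of falling back to the whole email.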

Limitations:

While I am personally extremely happy with this functionality, it is not without its flaws. As mentioned in the previous PyAutoGui article, this program requires you to not use your system during the time that it runs. This causes massive scaling issues…

Once a day before work starts? That is fine.

Another time while you are out at lunch? Also fine.

Checking every 30 minutes or every hour while working? That is an issue…

If you need real-time updates, you will need another system that you are not actively using. A dedicated Windows server or a Windows virtual machine can fill that role.

Another interesting limitation is saving emails that contain non-basic-Latin characters. This sidetracked my original program for roughly 2 hours while I was trying to sanitize a kanji email signature…

In the end, I opted to have the entire email encoded to UTF-8. In theory, you can spend time working out where the non-Latin characters start and end, encode just those characters, and save the rest of the email in its native format.
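
As a rough illustration of both routes, the sketch below writes text as UTF-8 directly and, separately, escapes only the non-ASCII characters so the rest of the message stays readable. The function names are placeholders of mine; backslashreplace is a standard Python error handler, used here as a simpler stand-in for working out by hand where the non-Latin runs start and end.

```python
def save_utf8(path, text):
    """Writes text as UTF-8 so kanji and other non-Latin characters survive."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def escape_non_ascii(text):
    """Keeps ASCII characters as-is and turns everything else into backslash escapes."""
    return text.encode("ascii", errors="backslashreplace").decode("ascii")
```

The first function keeps the signature intact for anything that reads UTF-8; the second trades readability of those few characters for a file that is plain ASCII from start to finish.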


Why does this matter and why would I need this?

If you have gotten this far, I have to commend you! When I originally talked to my team and family about this idea, I was instantly questioned about it, since reading an email isn’t that hard. I always had to explain to them the potential use cases for something like this.

Do you want to upload every email to Jira so that you can have the information in a ticket for other analysts/testers/managers to see the entire chain?

Do you want to parse every email into a database so that you have a more in-depth knowledge base for a chatbot to respond with?

Do you want to send an email through a Jira mail server, or would you prefer to send an email from a bot as if it were yourself?

These reasons (along with a few more client-specific reasons) are why I spent way too much time trying to figure out how to do everything listed in the two programs above. Below I have included the code in its final form.


If you get this far, thank you so much for taking the time to read this article on “Using Python to read and save your Outlook emails!”

Until next time, Stay curious and Hack the Planet!


Code:

winBypass.py

import pyautogui
import logging
import keyboard
import time
import argparse
import sys

# Set up logging
logging.basicConfig(level=logging.INFO)

def get_arg():
    """ Takes nothing
Purpose: Gets arguments from command line
Returns: Argument's values
"""
    parser = argparse.ArgumentParser()
    # Information
    parser.add_argument("-d","--debug",dest="debug",action="store_true",help="Turn on debugging",default=False)
    # Functionality
    parser.add_argument("-f","--find",dest="find",action="store_true",help="Turn on finder mode to see coordinates for mouse and colors",default=False)

    options = parser.parse_args()
    if options.debug:
        # basicConfig() already ran at import time, so calling it again would be ignored; raise the root logger's level instead
        logging.getLogger().setLevel(logging.DEBUG)
        global DEBUG
        DEBUG = True
    return options

def finder():
    """ Takes nothing
Purpose: Finds the mouse position and color
Returns: Nothing
"""
    while keyboard.is_pressed('q') != True:
        if keyboard.is_pressed('c') == True:
            x, y = pyautogui.position()
            r,g,b = pyautogui.pixel(x, y)

            logging.info("Mouse position: {}, {}. R: {}. G: {}. B: {}.".format(x, y, r, g, b))
            logging.info("\twin32api.SetCursorPos(({}, {}))".format(x, y))
            logging.info("\tpyautogui.pixel({}, {})[0] == {} and pyautogui.pixel({}, {})[1] == {} and pyautogui.pixel({}, {})[2] == {}\n".format(x, y, r, x, y, g, x, y, b))
            time.sleep(1)


def typeWriter(text):
    """ Takes text
Purpose: Types out the text
Returns: Nothing
"""
    if text == "ENTER":
        pyautogui.press('enter')
    else:
        pyautogui.typewrite(text)
        pyautogui.press('enter')


def clicker(x,y):
    """ Takes x and y coordinates
Purpose: Clicks the location
Returns: Nothing
"""
    pyautogui.click(x,y)


def main():
    options = get_arg()
    logging.info("Starting program")
    if options.find:
        finder()
        sys.exit(0) # Finder mode ended normally, so exit without an error code

    # Read the pixel once, then check that every color channel is in the expected range
    r, g, b = pyautogui.pixel(1496, 1434)
    if all(channel in range(40, 60) for channel in (r, g, b)):
        clicker(1496, 1434) # Clicks the location
        time.sleep(3) # Wait for the program to load
        typeWriter("cd testLocation") # Change to a different location (typeWriter presses Enter after typing)
        typeWriter("python emailReader.py") # Run emailReader.py program
        time.sleep(60) # Wait 60 seconds
        typeWriter("exit") # Close terminal
    else:
        logging.fatal("Color is not in range!") # Let user know that the color isn't in range

if __name__ == "__main__":
    main()

emailReader.py

import win32com.client
import win32com
import re

EMAILADDRESS = ""
IGNOREDSENDER = [""]

raw_emails = {}

with open("monitor.txt", "r") as f:
    lines = f.readlines()
print(lines)


def init():
    # Connect to Outlook once and reuse the application object for both handles
    app = win32com.client.Dispatch("Outlook.Application")
    outlook = app.GetNamespace("MAPI")
    accounts = app.Session.Accounts

    return accounts, outlook


def getEmails(accounts, outlook):
    """Takes accounts and outlook
Purpose: Gets emails from outlook
Returns: Dictionary of the captured emails
"""
    # Counter used for counting the amount of emails per subject.
    count = 0

    # Loop through all accounts
    for account in accounts:
        # print("Account: {}".format(account))

        # This is used if there are more than 1 account in outlook. If there are not, you will need to either remove the if statement and lower the indention for all statements in the if statement or add your email to EMAILADDRESS
        if str(account).lower() == EMAILADDRESS.lower():
            print("Account: {}".format(account))
            folders = outlook.Folders(account.DeliveryStore.DisplayName)
            specific_folder = folders.Folders

            # Loop through all folders
            for folder in specific_folder:
                #Prints the current folder you are in
                print("Folder: '{}'".format(folder))
                # Restricts the program to only check this folder. Useful if you are wanting to only check 1 location versus all of your folders
                if(folder.name == "Inbox"):
                    messages = folder.Items

                    # Loop through all messages
                    for single in messages:

                        # Check if the email subject is located in the monitor.txt file
                        for subject in lines:

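                            # Only look at emails whose subject contains an entry from the monitor.txt file. Both sides are lowercased and blank lines are skipped, since an empty string would match every email.
                            if subject.strip() and subject.strip().lower() in single.Subject.lower():
                                # Skip emails that are from the ignored senders. This is useful if an automated system is sending emails about something you are monitoring.
                                try:
                                    if single.SenderName.lower() in [sender.lower() for sender in IGNOREDSENDER]:
                                        # This continue belongs to the subject loop, so the email is actually skipped
                                        continue
                                except AttributeError:
                                    pass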
                                # I've found that certain email senders can cause issues with these fields.
                                try:
                                    print("Sender: {}".format(single.Sender))
                                    send = single.Sender
                                except AttributeError:
                                    try:
                                        print("Sender: {}".format(single.SenderName))
                                        send = single.SenderName
                                    except AttributeError:
                                        print("Sender: {}".format(single.SenderEmailAddress))
                                        send = single.SenderEmailAddress

                                # Prints subject
                                print("Subject: {}".format(single.Subject))
                                # Prints when the email was received
                                print("Received Time: {}".format(single.ReceivedTime))
                                # Prints if the email is unread or not
                                print("Unread: {}".format(single.Unread))

                                # This is used to get the body of the email. It will look for the first instance of "From:" or "Confidentiality Notice" and use that as the end of the email body.
                                loc = re.search("Confidentiality Notice", single.Body)
                                emailStart = re.search(r"From:\s", single.Body)

                                # This is used to show if either of them was found. If so, it shows the regex match
                                print("Email Start: {}".format(emailStart))
                                print("Location: {}".format(loc))

                                # Checks to see which one was found first and uses that as the end of the email body
                                if emailStart and not loc:
                                    end = emailStart.start()
                                elif loc and not emailStart:
                                    end = loc.start()
                                elif emailStart and loc:
                                    end = min(emailStart.start(), loc.start())
                                else:
                                    end = None

                                # Captures the body of the email until the established end is found. If no end was found (the first email in a chain), this captures the entire email.
                                body = single.Body[:end] if end else single.Body
                                body = body.replace("\r", "")
                                body = body.replace("\n\n", "\n")
                                body = body.strip()
                                print("Body:\n{}".format(body))

                                # This regex is used for tracking the number of emails pertaining to a specific subject. Replace "(Regex)" with your own pattern.
                                regex = re.compile(r"(Regex)")
                                name = re.findall(regex, str(single.Subject))
                                # Falls back to the full subject when the regex does not match, since an empty list cannot be used as a dictionary key
                                if name:
                                    name = name[0].strip()
                                else:
                                    name = str(single.Subject).strip()
                                print("Name: {}".format(name))
                                # Appends a counter to the name until it is unique so earlier emails are not overwritten
                                if name in raw_emails:
                                    count = 0
                                    while "{}_{}".format(name, count) in raw_emails:
                                        count += 1
                                    name = "{}_{}".format(name, count)
                                    print("Unique Name: {}".format(name))

                                # Adds it to a dictionary so you can modify the data later or save it
                                raw_emails[name] = {"body": body.strip(), "subject": str(single.Subject).strip(), "sender": str(send).strip(), "received": str(single.ReceivedTime), "unread": single.Unread}
                                # Separates one email from another
                                print("-"*250+"\n\n")
    # Prints all of the content
    print(raw_emails)

    # Converts the dictionary to a JSON string. json.dumps escapes quotes inside the emails and writes booleans in a form other programs can read.
    # This import can be moved to the top of the file with the other imports.
    import json
    tmpEmails = json.dumps(raw_emails, indent=4)

    # Uncomment if you want it saved as a json file. You can also make this a flag from an argument
    # with open("emails.json", "w", encoding="utf-8") as f:
    #     f.write(tmpEmails)

    # Saves the emails to a text file. Opening the file as UTF-8 lets subjects and bodies with non-ASCII characters be written directly.
    with open("emails.txt", "w", encoding="utf-8") as f:
        for key, value in raw_emails.items():
            f.write("ID: {}\n".format(key))
            f.write("Subject: {}\n".format(value["subject"]))
            f.write("Sender: {}\n".format(value["sender"]))
            f.write("Received: {}\n".format(value["received"]))
            f.write("Unread: {}\n".format(value["unread"]))
            f.write("Body:\n{}\n".format(value["body"]))
            f.write("-"*250+"\n\n")

    print("Finished Successfully")
    return raw_emails


def main():
    accounts, outlook = init()
    emails = getEmails(accounts, outlook)
    print(emails)

if __name__ == "__main__":
    main()
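
The body-trimming step inside getEmails can be pulled out into a small function so it can be tried without Outlook. This is a minimal sketch, not part of the original program: trim_body and END_MARKERS are hypothetical names, and the sample text is made up. It looks for the same two markers ("From:" and "Confidentiality Notice") and keeps everything before whichever appears first.

```python
import re

# Hypothetical names; the patterns mirror the two markers searched for in getEmails
END_MARKERS = (r"From:\s", r"Confidentiality Notice")


def trim_body(raw_body):
    """Return the text before the earliest end marker, with blank lines collapsed."""
    # Collect the start position of every marker that was found
    starts = [m.start() for m in (re.search(p, raw_body) for p in END_MARKERS) if m]
    end = min(starts) if starts else None
    # As in getEmails, a missing marker (or one at position 0) keeps the whole body
    body = raw_body[:end] if end else raw_body
    return body.replace("\r", "").replace("\n\n", "\n").strip()


# Made-up reply chain: only the newest message should survive
sample = "Hi team,\r\n\r\nThe report is attached.\r\n\r\nFrom: Alice\r\nSent: Monday\r\nOlder text"
print(trim_body(sample))
```

Keeping this logic in its own function makes it easy to add another marker (for example a signature line) to END_MARKERS without touching the Outlook loop.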

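If another program needs to read the saved emails, the standard json module can write the dictionary and load it back unchanged. The sketch below assumes a dictionary shaped like raw_emails; save_emails_json, load_emails_json, the sample data, and the file name are all hypothetical and only there for illustration.

```python
import json
import os
import tempfile

# Made-up sample shaped like the raw_emails dictionary built in getEmails
sample_emails = {
    "Ticket 42": {
        "body": 'She said "it\'s fixed"',
        "subject": "Ticket 42 update",
        "sender": "Example Sender",
        "received": "2024-01-01 09:00:00",
        "unread": True,
    }
}


def save_emails_json(emails, path):
    """Write the emails dictionary to path as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(emails, f, indent=4)


def load_emails_json(path):
    """Read the emails dictionary back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Writes to the system temp directory so the example does not clutter the project folder
path = os.path.join(tempfile.gettempdir(), "emails_example.json")
save_emails_json(sample_emails, path)
restored = load_emails_json(path)
print(restored == sample_emails)
```

The sample body deliberately contains both single and double quotes, and the unread flag is a boolean, since those are the values most likely to trip up a hand-built JSON string.
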
That's it for this topic. Thank you for reading.
