Phishing Analysis with Voyant Tools
23 Apr 2023 (1772 Words, 10 Minutes)
Recently I came across a tool called Voyant Tools.
Voyant Tools is a web-based text analysis platform designed to assist researchers in exploring, analyzing, and visualizing digital texts. Developed by Stéfan Sinclair and Geoffrey Rockwell, this suite of tools allows users to investigate patterns, frequencies, and relationships within textual data, making it an invaluable resource for various disciplines such as digital humanities, linguistics, and data-driven journalism.
Although Voyant Tools is not specifically designed for analyzing malicious email files (.eml), it can still provide valuable insights into the structure, content, and patterns present in such files. Researchers can benefit from the following features:
Word frequency analysis: Voyant Tools can generate frequency lists and visualizations, helping users identify common terms and phrases that may be associated with phishing or spam emails.
Keyword-in-context (KWIC): By examining the context in which specific words or phrases appear, researchers can gain insights into the tactics and themes used by cybercriminals to deceive victims.
Collocation analysis: By exploring the relationships between words and phrases, users can identify patterns and connections that might reveal the intentions or strategies employed in malicious emails.
Visualization tools: Voyant Tools offers various visualization options, such as word clouds, network graphs, and trend graphs, allowing researchers to visualize patterns and relationships within the email data in an intuitive manner.
Corpus comparison: By comparing malicious emails with a corpus of legitimate emails, users can identify distinctive features or characteristics that may help in detecting and preventing phishing scams.
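To make the first three features concrete, here is a minimal Python sketch of word frequency, keyword-in-context and collocation counting. It is not how Voyant Tools implements them; the stop-word list, the function names and the sample sentence are my own assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "your", "is", "in", "for", "or"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most common tokens that are not stop words."""
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]
    return Counter(tokens).most_common(top_n)


def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for every hit."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


def collocates(text, keyword, window=3):
    """Count non-stop-word tokens that appear within `window` words of keyword."""
    tokens = tokenize(text)
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(t for t in neighbours if t not in STOPWORDS)
    return counts


# Made-up sample text, not taken from the dataset.
sample = (
    "Your account has been suspended. Verify your account now "
    "to avoid losing access. Click here to verify."
)

print(word_frequencies(sample, top_n=3))
print(kwic(sample, "verify", window=2))
print(collocates(sample, "account", window=2))
```

Running the same functions over a set of legitimate emails and comparing the counts would be a rough stand-in for the corpus comparison feature.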
It is important to note that while Voyant Tools can provide valuable insights into the textual features of malicious emails, it does not offer specific functionality for analyzing email headers, attachments, or embedded links. As such, researchers should complement their use of Voyant Tools with additional cybersecurity tools and techniques to conduct a comprehensive analysis of potentially harmful emails.
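As one example of such a complementary step, the sketch below uses Python's standard `email` package to pull a few headers, the embedded links and the attachment names out of a raw message. The function name `summarize_email`, the URL pattern and the sample message are assumptions made for illustration; a real triage workflow would check far more than this.

```python
import re
from email import message_from_string, policy

# Deliberately simple URL pattern (an assumption; real URL extraction is messier).
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def summarize_email(raw):
    """Collect a few header fields, body URLs and attachment names from raw email text."""
    msg = message_from_string(raw, policy=policy.default)

    # Prefer the plain-text body and fall back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    return {
        "from": str(msg.get("From", "")),
        "reply_to": str(msg.get("Reply-To", "")),
        "subject": str(msg.get("Subject", "")),
        "received_hops": len(msg.get_all("Received", [])),
        "urls": URL_RE.findall(body),
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
    }


# Made-up message using reserved example domains.
SAMPLE_EML = (
    "From: Support <support@example.com>\n"
    "Reply-To: attacker@example.net\n"
    "Subject: Verify your account\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Please verify at http://example.org/login now.\n"
)

for key, value in summarize_email(SAMPLE_EML).items():
    print(f"{key}: {value}")
```

A mismatch between the From and Reply-To domains, as in this made-up sample, is the kind of detail that text analysis alone will not surface.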
Now let’s talk about phishing:
Phishing scams are fraudulent activities wherein cybercriminals masquerade as legitimate entities to deceive individuals into revealing sensitive information, such as usernames, passwords, financial details, and personal information. These scams typically rely on communication channels such as email, social media, and text messages to lure victims into clicking malicious links, downloading harmful attachments, or providing confidential data.
The most commonly used themes in phishing scams include:
Financial institutions: Cybercriminals often pose as banks, credit card companies, or investment firms, urging individuals to update their account information or confirm a suspicious transaction.
Tech support: Scammers may pretend to be representatives of well-known technology companies, claiming that there is a problem with the victim’s device or account that requires immediate action.
Tax and government agencies: Phishing attacks may also involve fraudsters impersonating government agencies like the IRS, instructing recipients to resolve urgent tax issues or verify their identity for government benefits.
Social media: Scammers may impersonate friends or followers on social media, asking for financial help, sharing sensational news stories, or promoting enticing giveaways that require personal information.
Shipping and delivery notifications: Fake notifications of package deliveries or shipment delays are used to trick individuals into clicking malicious links or providing sensitive data.
Phishing scams are a serious threat. Below are some recent statistics that underline the danger:
Phishing is the most common form of cybercrime, with an estimated 3.4 billion spam emails sent every day.
According to the FBI’s Internet Crime Complaint Center (see “Internet Crime Complaint Center Releases 2022 Statistics” and the FBI Internet Crime Report 2021), phishing ranks first among reported Internet crimes, with 323,972 reported victims of Phishing/Vishing/Smishing/Pharming. Business Email Compromise separately accounted for $2,395,953,296 in reported losses.
The Anti-Phishing Working Group (APWG) reported that in the third quarter of 2021, there were more than 222,000 unique phishing sites detected, marking a 7.3% increase from the previous quarter.
A 2021 study by Proofpoint found that 75% of organizations worldwide had experienced a phishing attack, with 74% of successful attacks leading to data breaches.
Learn more about the latest phishing trends in 2023 here.
As phishing scams continue to evolve and target a wide range of industries and individuals, it is crucial to raise awareness and implement robust security measures to protect against these threats. Today we will use Voyant Tools to analyze some of the common and pressing themes in the realm of phishing emails.
You can also study phishing emails without becoming a victim yourself by using tools and services such as CaniPhish. It is primarily used for user training in enterprise networks, but standalone end users can benefit from it as well.
Utilizing Voyant Tools for analyzing Phishing
This is not rigorous research, but I will still outline the overall methodology used in this article.
I found a dataset of sample phishing emails that were used in real-world malware campaigns and submitted by users and administrators. These .eml files contain all the details of each phishing email as it was received on the endpoint, including the security headers and the email contents.
Voyant Tools decodes Base64-encoded email content on its own. It also does not evaluate the security headers and similar fields in the .eml files; it focuses on the main content of the email itself. This saves the time we would otherwise spend cleaning complex data such as the “original message” of each email, since here we are interested only in text analysis of malicious emails. For safety, the real malicious links and sensitive information in these sample emails have been replaced by benign values:
Remember to anonymize the files hiding information that could identify the address of your Honey Pot. All sensitive information should be replaced with phishing@pot. Sometimes the email address is contained within the content, either in the body of the message or in malicious URL arguments. Be sure to check these fields. If the content is encoded in base64, decode it, change the necessary values, re-encode it in base64 (respecting the indentation).
The passage above is quoted from Phishing Pot’s GitHub repo.
Data set used - Phishing Pot
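The anonymization step quoted above (decode the Base64 content, replace the address, re-encode) could look roughly like this in Python. This is only a sketch under my own assumptions: the body is UTF-8 text, the re-encoded output is wrapped at 76 characters per line as is common for MIME, and `honeypot@example.com` is a made-up address. Phishing Pot does not prescribe this code.

```python
import base64

# Placeholder requested by the Phishing Pot guidelines quoted above.
PLACEHOLDER = "phishing@pot"


def anonymize_base64_body(encoded_body, real_address, line_width=76):
    """Decode a Base64 body, swap the real address for the placeholder, re-encode.

    Assumes UTF-8 text; line_width=76 is an assumed MIME-style wrap width.
    """
    # b64decode ignores the line breaks inside a wrapped Base64 body by default.
    decoded = base64.b64decode(encoded_body).decode("utf-8")
    cleaned = decoded.replace(real_address, PLACEHOLDER)
    reencoded = base64.b64encode(cleaned.encode("utf-8")).decode("ascii")
    # Re-wrap the output so it keeps the usual multi-line layout.
    lines = [reencoded[i:i + line_width] for i in range(0, len(reencoded), line_width)]
    return "\n".join(lines)


# Made-up body containing the (hypothetical) honeypot address twice.
original = "Dear honeypot@example.com, click http://example.org/?u=honeypot@example.com"
encoded = base64.b64encode(original.encode("utf-8")).decode("ascii")

anonymized = anonymize_base64_body(encoded, "honeypot@example.com")
print(base64.b64decode(anonymized).decode("utf-8"))
```

Note that the address is replaced both in the greeting and in the URL argument, which are the two places the guidelines ask contributors to check.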
Psychology behind Phishing:
Phishing emails exploit various psychological principles to manipulate victims into divulging sensitive information or performing actions that compromise their security. Some of the key psychological targets employed by cybercriminals include:
Sense of urgency: Phishing emails often create a sense of urgency, pressuring recipients to take immediate action to resolve a problem or claim a reward. This tactic exploits individuals’ natural tendency to prioritize immediate threats or opportunities over more rational decision-making.
Authority: Cybercriminals may impersonate legitimate organizations, government agencies, or well-known individuals to establish a sense of authority. This tactic leverages people’s inclination to comply with requests from perceived authority figures, even if the requests are unusual or suspicious.
Curiosity: Scammers may use clickbait headlines or provocative content to pique recipients’ curiosity, enticing them to click on malicious links or download harmful attachments.
Fear: Phishing emails often evoke fear by warning recipients about potential security breaches, legal issues, or financial losses. By exploiting people’s instinctive desire to avoid negative consequences, cybercriminals can manipulate them into providing sensitive information or clicking on harmful links.
Greed: Scammers may promise financial gains, exclusive deals, or valuable rewards to lure victims into sharing personal information or making unwise decisions. This tactic capitalizes on people’s innate desire for wealth and success.
Social proof: Phishing emails may include fabricated testimonials, endorsements, or social media shares to establish credibility and make the scam appear more legitimate. This tactic exploits people’s tendency to rely on others’ opinions and experiences when making decisions.
Reciprocity: Some phishing attacks use the principle of reciprocity, offering a small favor or gift to create a sense of obligation in the recipient. This tactic can make people more likely to comply with the scammer’s requests, as they feel indebted to return the favor.
Familiarity: Phishing emails may appear to come from a known contact or mimic the visual style and language of legitimate organizations. This tactic exploits people’s trust in familiar sources and lowers their defenses against potential threats.
By understanding and recognizing these psychological targets, individuals can become more vigilant and better equipped to identify and avoid falling victim to phishing scams.
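A crude way to look for these triggers in text is to match each email against a small cue-word list per trigger. The sketch below does that; the lexicon is entirely hypothetical (I made up the words for illustration), it covers only four of the triggers above, and it is no substitute for reading the email in context.

```python
import re

# Hypothetical cue words per psychological trigger (assumptions, not a vetted lexicon).
TRIGGER_LEXICON = {
    "urgency": {"immediately", "urgent", "now", "expire", "expires", "last"},
    "authority": {"bank", "irs", "security", "official", "administrator"},
    "fear": {"suspended", "terminated", "unauthorized", "breach", "locked"},
    "greed": {"reward", "prize", "free", "bonus", "bitcoin"},
}


def tag_triggers(text):
    """Return {trigger: sorted cue words found} for every trigger with a match."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    tags = {}
    for trigger, cues in TRIGGER_LEXICON.items():
        found = tokens & cues
        if found:
            tags[trigger] = sorted(found)
    return tags


# Made-up example sentence.
example = "Your subscription will be terminated. Renew now to claim a free reward."
print(tag_triggers(example))
```

The terms that Voyant Tools surfaces as most frequent across a corpus are a natural source of better cue words than the ones guessed here.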
Using Voyant Tools, we will attempt to identify and understand some of the most common themes in the realm of phishing. I have chosen 11 random malicious .eml files for analysis.
“Last day to claim your exclusive offer / reward” type phishing scams.
Phishing emails like these create a sense of curiosity and entice the victim to engage with the scheme: attempting to log into the supposed wallet, clicking the malicious links, or opening the malicious attachments.
Bitcoin- and cryptocurrency-themed phishing scams often offer some Bitcoins, not much, just 75… enticing, isn’t it?!
“Your McAfee subscription may be terminated, extend it…” creates a sense of urgency and authority, as we will see in some other phishing emails here.
No attachment was associated with this malicious email, although such emails usually carry either a malicious link or an attachment. The theme is simple: an innocent-looking message that claims to come from a bank, a genuine case of banking fraud.
Moving on to more modern themes, we have an NFT (OpenSea) scam.
This one has a malicious link in its email body, which security vendors have already flagged as malicious. The theme is based on a “money withdrawal”, or on some amount of money supposedly sitting in a wallet or account of the attackers’ choosing.
An enticing social-proof-themed phishing email, revolving around food, diet planning, and an exclusive invitation to join their program or subscription.
Classic security-alert phishing scam - Facebook: someone tried to log into your account.
Charming Russian girls…
KYC wallet verification scam. These variations of phishing emails often leverage a sense of urgency and authority.
I hope this gives end users a deeper look at the body and contents of typical phishing emails used in real-world malware campaigns. With Voyant Tools, you can tweak and experiment with the analysis above. Staying aware and cautious keeps us safer: the built-in spam filters in most mailboxes are effective, but attackers evolve over time and often bypass them. Being able to tell the malicious messages in your mailbox apart from the benign ones will help keep you safe.