Power Query Data Cleaning: A Comprehensive Guide to the From Text Section of Add Column
- Fakhriddinbek
- 7 hours ago
In the world of data, the old saying holds true: garbage in, garbage out. No matter how powerful your analysis or how beautiful your report, if your underlying data is messy, your insights will be flawed. This is where Power Query shines. It's not just a tool for connecting to data sources; it’s a robust engine for cleaning and transforming your data.
While many of Power Query's features are designed for structural changes, the From Text section of the Add Column tab and its counterparts are where you do the detailed work of standardizing and correcting your data at a granular level. This guide will walk you through the essential transformations for text, numbers, and dates, showing you how to turn your messy raw data into a clean, trustworthy dataset ready for analysis.

The Art of Cleaning Text Data with the From Text Section of Add Column
Text data is often the most inconsistent part of a dataset. Typos, varying capitalization, and extra characters can all cause problems when you try to filter or group your information. The From Text section of the Add Column tab provides a suite of tools to fix these issues without writing any code.
1. Extracting Specific Information
You often don't need an entire text string; you just need a small piece of it. The Extract dropdown is your go-to for this.
First Characters/Last Characters: Pull a specific number of characters from the beginning or end of a string. This is useful for extracting a product code that is always a fixed number of characters long. (The separate Length option returns the number of characters in the string rather than a piece of it.)
Text Before/After Delimiter: This is a lifesaver when you need to grab text that is separated by a specific character (the "delimiter"). For instance, you can extract a user's name from an email address by using the @ symbol as the delimiter.
Text Between Delimiters: Perfect for pulling a value that's sandwiched between two characters, like a transaction ID between parentheses.
By using these simple functions, you can normalize inconsistent text and create new columns that contain only the data you need for analysis.
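If you open the Advanced Editor after clicking these options, you will see that each one adds a step written in Power Query's M language. The following is a minimal sketch of what those steps look like, not the exact code the editor produces: the sample table and the column names (ProductCode, Email, Note) are made up for illustration, and the generated step names will differ.

```powerquery
let
    // Hypothetical sample data; table and column names are illustrative only
    Source = #table(
        type table [ProductCode = text, Email = text, Note = text],
        {
            {"AB123-red", "maria@example.com", "Refund (TX-1001) issued"},
            {"CD456-blue", "john@example.com", "Paid (TX-1002) in full"}
        }
    ),
    // Extract > First Characters: keep the first 5 characters of the code
    AddedCode = Table.AddColumn(Source, "Code", each Text.Start([ProductCode], 5), type text),
    // Extract > Text Before Delimiter: everything before the "@"
    AddedUser = Table.AddColumn(AddedCode, "UserName", each Text.BeforeDelimiter([Email], "@"), type text),
    // Extract > Text Between Delimiters: the value between "(" and ")"
    AddedTxId = Table.AddColumn(AddedUser, "TransactionId", each Text.BetweenDelimiters([Note], "(", ")"), type text)
in
    AddedTxId
```

For the first sample row this yields "AB123", "maria", and "TX-1001". Text.End works the same way as Text.Start but counts from the end of the string.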
2. Standardizing Text with Formatting
Varying capitalization can make it seem like "apples," "Apples," and "APPLES" are all different products. The Format options make it easy to standardize your text.
Lowercase/Uppercase: Convert all text to a consistent case. It's best practice to convert all categorical text columns (like product names or customer regions) to lowercase to ensure they group correctly in a PivotTable or report.
Capitalize Each Word: This is great for making names and titles look professional.
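Behind the scenes, the Format options map to simple M text functions. Here is a small sketch, assuming a made-up table with Product and Customer columns; it shows why the three spellings of "apples" collapse into one value after lowercasing.

```powerquery
let
    // Hypothetical sample data; column names are illustrative only
    Source = #table(
        type table [Product = text, Customer = text],
        {
            {"apples", "maria GARCIA"},
            {"Apples", "JOHN smith"},
            {"APPLES", "li wei"}
        }
    ),
    // Format > lowercase: all three product rows become "apples"
    AddedLower = Table.AddColumn(Source, "ProductLower", each Text.Lower([Product]), type text),
    // Format > Capitalize Each Word: "maria GARCIA" becomes "Maria Garcia"
    AddedProper = Table.AddColumn(AddedLower, "CustomerProper", each Text.Proper([Customer]), type text)
in
    AddedProper
```

Text.Upper is the uppercase counterpart of Text.Lower. Because text comparisons in Power Query are case-sensitive by default, grouping on ProductLower gives one group where grouping on Product would give three.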
Transforming Numbers for Accurate Calculations
Numbers in the wrong format can break your entire analysis. If a column of numbers is saved as "text," Power Query won't be able to sum or average it. The From Number section handles these issues with a few clicks.
Standard & Scientific Operations: Need to add a sales tax, calculate a discount, or find the square root of a value? The Standard menu covers arithmetic such as add, subtract, multiply, divide, and percentage, while the Scientific menu covers operations such as absolute value, power, square root, and logarithm. Each one creates a new column with the result.
Rounding: Clean up messy decimal values by rounding up, rounding down, or rounding to a chosen number of decimal places. This is essential for financial reporting and other scenarios where values must be presented at a consistent precision.
Conversion: A key task is converting text that looks like a number (e.g., "123.45") into an actual number data type. The Data Type dropdown in the Home tab is your first stop for this; the From Number options only become available once the column has a numeric type.
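The number steps can be sketched in M as follows. Treat this as an illustration under stated assumptions rather than the exact generated code: the Price column, the 8% tax rate, and the "en-US" culture (used so that "." is read as the decimal separator) are all made up for the example.

```powerquery
let
    // Hypothetical sample: Price arrives as text
    Source = #table(type table [Price = text], {{"123.45"}, {"19.99"}}),
    // Data Type > Decimal Number; the "en-US" culture is an assumption
    Typed = Table.TransformColumnTypes(Source, {{"Price", type number}}, "en-US"),
    // Standard > Multiply: price including an assumed 8% sales tax
    AddedTax = Table.AddColumn(Typed, "PriceWithTax", each [Price] * 1.08, type number),
    // Rounding > Round...: keep two decimal places
    AddedRounded = Table.AddColumn(AddedTax, "PriceWithTaxRounded", each Number.Round([PriceWithTax], 2), type number),
    // Scientific > Square Root
    AddedSqrt = Table.AddColumn(AddedRounded, "PriceSqrt", each Number.Sqrt([Price]), type number)
in
    AddedSqrt
```

One detail worth knowing: Number.Round rounds exact midpoints to the nearest even digit by default. If your reporting rules expect midpoints to round away from zero, pass RoundingMode.AwayFromZero as the third argument.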
Mastering Date and Time Data
Dates and times are the backbone of time-based analysis. They can tell you about sales trends, seasonal patterns, and project timelines. But like other data types, they need to be in a consistent format. The From Date & Time section makes this effortless.
Extraction: You can easily pull out key components from a date column, such as the Year, Month, Day, or even the Hour of a transaction. This allows you to group data by month or year to spot trends.
Age and Duration: Need to calculate the number of days a ticket has been open or an employee has been with the company? The Age feature can calculate the duration between a date in your table and the current date.
Week of Year/Month: This is incredibly useful for creating weekly or monthly reports. You can create a column that shows the week number or month name, making it simple to build a PivotTable that summarizes data over time.
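The date options above also translate into short M formulas. The sketch below assumes a made-up table with a single OrderDate column that is already typed as a date; the new column names are illustrative.

```powerquery
let
    // Hypothetical sample data; OrderDate is already a date column
    Source = #table(
        type table [OrderDate = date],
        {{#date(2024, 1, 15)}, {#date(2024, 7, 4)}}
    ),
    // Date > Year
    AddedYear = Table.AddColumn(Source, "Year", each Date.Year([OrderDate]), Int64.Type),
    // Date > Month > Name of Month (the name depends on the current culture)
    AddedMonthName = Table.AddColumn(AddedYear, "MonthName", each Date.MonthName([OrderDate]), type text),
    // Date > Week > Week of Year
    AddedWeek = Table.AddColumn(AddedMonthName, "WeekOfYear", each Date.WeekOfYear([OrderDate]), Int64.Type),
    // Date > Age: duration between today and the order date
    AddedAge = Table.AddColumn(AddedWeek, "Age", each Date.From(DateTime.LocalNow()) - [OrderDate], type duration),
    // Turn the duration into a whole number of days
    AddedAgeDays = Table.AddColumn(AddedAge, "AgeDays", each Duration.Days([Age]), Int64.Type)
in
    AddedAgeDays
```

Age returns a duration value, so a follow-up step such as Duration.Days is usually needed before the number is useful in a report. The Age and AgeDays values change every day because they are measured against the current date at refresh time.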
Putting It All Together: A Practical Example
Imagine you have a sales table with a column called "OrderDate" and another called "OrderNotes."
Clean the text: You notice the "OrderNotes" column sometimes contains a SKU number. You can use Extract > Text After Delimiter with a specific keyword to pull out the SKU into a new, clean column. Then, you can use Format > Capitalize Each Word to make sure the notes are consistently formatted.
Clean the dates: The "OrderDate" column might be a text string. You'll convert it to a Date data type first. Then, you can use Date > Year to create a new "OrderYear" column and Date > Month to create an "OrderMonth" column. This allows you to easily analyze sales trends by year and month.
Create a custom column: You can use the Custom Column feature (found in the General section) to create a "Sales Period" column that says "Q1" or "Q2" based on the month number.
By using these targeted transformations, you've turned a raw table into a clean, structured dataset that can be used for deep, meaningful analysis.
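Put together, the walkthrough might look something like the query below. It is only a sketch: the sample rows, the "SKU:" keyword used as the delimiter, and the ISO date format are assumptions made for the example, and the Sales Period formula is one of several ways to derive a quarter from a month number.

```powerquery
let
    // Hypothetical sample; the "SKU:" keyword and the date format are assumptions
    Source = #table(
        type table [OrderDate = text, OrderNotes = text],
        {
            {"2024-02-10", "rush order SKU:A1001"},
            {"2024-05-21", "gift wrap SKU:B2002"}
        }
    ),
    // Extract > Text After Delimiter, using the keyword as the delimiter
    AddedSku = Table.AddColumn(Source, "SKU", each Text.AfterDelimiter([OrderNotes], "SKU:"), type text),
    // Convert the text dates to a real Date type before using any date functions
    TypedDate = Table.TransformColumnTypes(AddedSku, {{"OrderDate", type date}}),
    // Date > Year and Date > Month
    AddedYear = Table.AddColumn(TypedDate, "OrderYear", each Date.Year([OrderDate]), Int64.Type),
    AddedMonth = Table.AddColumn(AddedYear, "OrderMonth", each Date.Month([OrderDate]), Int64.Type),
    // Custom Column: months 1-3 give "Q1", months 4-6 give "Q2", and so on
    AddedPeriod = Table.AddColumn(AddedMonth, "Sales Period", each "Q" & Text.From(Number.RoundUp([OrderMonth] / 3)), type text)
in
    AddedPeriod
```

With these sample rows, the February order lands in "Q1" and the May order in "Q2". Date.QuarterOfYear([OrderDate]) would give the same quarter number directly from the date if you prefer not to go through the month column.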
Mastering these data cleaning techniques in Power Query is a fundamental step toward becoming a data professional. It's the difference between struggling with messy data and building powerful, reliable reports with confidence.