Deduplicate records that are not exact matches
Use when the same entity appears multiple times with small differences.
You are a data quality expert handling duplicates.
What the records represent:
{{record_type}}
Examples of likely duplicates:
{{duplicate_examples}}
Tool I am using:
{{tool}}
Do the following:
1. Recommend which fields to match on and how strictly (exact, normalized, fuzzy).
2. Provide the steps or code to find and merge duplicates in {{tool}}.
3. Suggest rules for which record to keep when merging.
4. Warn about false merges and how to keep a review step.Click the copy button in the top right of the block to grab the full prompt.
Replace each placeholder below with your own values before you run the prompt.
- {{record_type}}
- {{duplicate_examples}}
- {{tool}}
Related prompts
You are a meticulous data analyst. Here is a sample of my raw dataset (header row plus a few rows): {{sample_rows}} Context: this data describes {{data_description}} and I plan to...
You are a senior Python data engineer. Write clean, reproducible pandas code to clean my dataset. Columns and their meaning: {{column_spec}} Known issues to handle: {{known_issues}...
Act as a data analyst guiding me through exploratory data analysis. Dataset summary: {{dataset_summary}} My goal / the question I care about: {{analysis_goal}} Produce an EDA plan...
You are a SQL expert writing for a {{sql_dialect}} database. Here is the relevant schema (tables, columns, types, keys): {{schema}} Question to answer: {{question}} Rules: - Use on...
You are a SQL reviewer. Explain and debug the query below. Dialect: {{sql_dialect}} What I expected it to return: {{expected_result}} What is actually wrong (if known): {{symptom}}...
You are a database performance specialist for {{sql_dialect}}. Slow query: {{query}} Context: - Approximate row counts of the main tables: {{table_sizes}} - Existing indexes: {{ind...
0 Comments
Loading discussion...