Pick an outlier handling strategy for a column

Use when extreme values are distorting your analysis and you must decide whether to keep, cap, transform or drop them.

The prompt
prompt.txt
You are a careful statistician.

Column: {{column_name}} from {{dataset_description}}.
Here is the distribution summary: {{distribution_summary}}.
The downstream use is: {{downstream_use}}.

Help me handle outliers without throwing away signal:
1. Suggest detection methods suited to this distribution (z-score, IQR, modified z-score, isolation forest) and say which fits best.
2. For each candidate, say what it would flag here.
3. Recommend keep vs winsorize vs transform vs drop, tied to the downstream use.
4. Give {{tool}} code for the recommended approach.
5. Remind me how to document the decision so it is reproducible.

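To sanity-check whatever the model returns for steps 1 to 3, it can help to run the simpler detectors yourself. The following is a minimal sketch using only the Python standard library, not the output the prompt will produce. The sample values, the column name `example_column`, the thresholds (3.0, 1.5, 3.5) and all function names are assumptions chosen for illustration. Isolation forest is left out because it needs a third-party library such as scikit-learn.

```python
import json
import statistics


def zscore_flags(values, threshold=3.0):
    """Flag points more than `threshold` sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [False] * len(values)  # constant column: nothing to flag
    return [abs(v - mean) / sd > threshold for v in values]


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_flags(values, k=1.5):
    """Flag points that fall outside the IQR fences."""
    lower, upper = iqr_fences(values, k)
    return [v < lower or v > upper for v in values]


def modified_zscore_flags(values, threshold=3.5):
    """Flag points using the median and MAD instead of the mean and stdev."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)  # score is undefined when MAD is 0
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def winsorize_iqr(values, k=1.5):
    """Cap values at the IQR fences instead of dropping them."""
    lower, upper = iqr_fences(values, k)
    return [min(max(v, lower), upper) for v in values]


# Made-up sample with one extreme value, for illustration only.
values = [10, 12, 11, 13, 12, 11, 10, 12, 13, 95]

lower, upper = iqr_fences(values)
capped = winsorize_iqr(values)

# A small decision record covering step 5 (hypothetical field names).
decision = {
    "column": "example_column",
    "action": "winsorize to IQR fences",
    "k": 1.5,
    "lower_fence": lower,
    "upper_fence": upper,
    "n_flagged": {
        "z_score": sum(zscore_flags(values)),
        "iqr": sum(iqr_flags(values)),
        "modified_z": sum(modified_zscore_flags(values)),
    },
}
print(json.dumps(decision, indent=2))
```

On this sample the plain z-score flags nothing while the IQR rule and the modified z-score both flag the value 95. With a sample standard deviation, the largest possible |z| among n points is (n - 1) / sqrt(n), roughly 2.85 for n = 10, so a threshold of 3 can never trigger on a sample this small. That is one reason the prompt asks which method fits the distribution rather than assuming one.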

Variables

Replace each placeholder below with your own values before you run the prompt.

  • {{column_name}}
  • {{dataset_description}}
  • {{distribution_summary}}
  • {{downstream_use}}
  • {{tool}}
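
If you fill the placeholders programmatically, a small helper can build the distribution summary and refuse to send a prompt with anything left unfilled. This is a sketch under stated assumptions: the template below is only an excerpt of the prompt, and the sample data, the example values (`order_value`, the dataset description, the downstream use) and the helper names `summarize` and `fill_prompt` are hypothetical.

```python
import re
import statistics

# Excerpt of the prompt above; it covers all five placeholders.
TEMPLATE_EXCERPT = """Column: {{column_name}} from {{dataset_description}}.
Here is the distribution summary: {{distribution_summary}}.
The downstream use is: {{downstream_use}}.
Give {{tool}} code for the recommended approach."""

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def summarize(values):
    """Build a one-line distribution summary for {{distribution_summary}}."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (
        f"n={len(values)}, min={min(values)}, q1={q1}, median={med}, "
        f"q3={q3}, max={max(values)}, "
        f"mean={statistics.fmean(values):.2f}, stdev={statistics.stdev(values):.2f}"
    )


def fill_prompt(template, variables):
    """Replace every {{name}} in the template; fail loudly if one is missing."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(variables))
    if missing:
        raise KeyError(f"unfilled placeholders: {missing}")
    return PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), template)


# Made-up data and values, for illustration only.
sample = [10, 12, 11, 13, 12, 11, 10, 12, 13, 95]
filled = fill_prompt(
    TEMPLATE_EXCERPT,
    {
        "column_name": "order_value",
        "dataset_description": "a table of e-commerce orders",
        "distribution_summary": summarize(sample),
        "downstream_use": "a linear regression on revenue",
        "tool": "pandas",
    },
)
print(filled)
```

Including quartiles alongside the mean and standard deviation in the summary gives the model enough to judge skew, which matters when it chooses between the z-score and the more robust methods.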

Recommended models
Claude Opus 4.8, GPT-5, Gemini 2.5 Pro

Tags
#outliers #statistics #data-cleaning #distribution

