Audit Findings · 2026-04-28 · 9 min · draft, not yet final

Five Amplitude misconfigurations that quietly kill data quality

The defaults look fine. The data isn't. Here are the five misconfigurations Adasight finds most often in audits, and how to fix them.

Most Amplitude setups look fine on the surface. Events are flowing. Charts are rendering. Dashboards have data. Then a serious analytical question gets asked, and the answer turns out to be wrong, and the team spends three weeks figuring out why.

Across dozens of Amplitude audits for clients, the same five misconfigurations show up again and again. None of them fail loudly. All of them silently corrupt the analysis layered on top.

1. Identification rate well below 90%

The default Amplitude SDK sends events with a device_id (the anonymous browser/app identifier). When the user logs in, the SDK is supposed to receive an identify call (in the Browser SDK, setting the user ID with setUserId) that links device_id to a stable user_id. From that point on, the user's history is unified.

In practice, many implementations skip or delay the identify call. We see identification rates as low as 30–40% on production projects. That means the majority of events cannot be tied to a specific user across sessions, retention is undercounted, and any user-property segmentation is unreliable.
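To put a number on this for your own project, a rough sketch like the one below can be run over a sample of exported events. It assumes the rate is measured per event and that each event is an object with user_id and device_id fields; both are assumptions for illustration, so adjust the field names and the definition to match your export.

```javascript
// Share of events in a sample that carry a user_id.
// Assumption: events are plain objects with `user_id` and `device_id` fields.
function identificationRate(events) {
  if (events.length === 0) return null; // no data, no rate
  const identified = events.filter(
    (e) => e.user_id !== null && e.user_id !== undefined && e.user_id !== ''
  ).length;
  return identified / events.length;
}

// Illustrative sample: two identified events, two anonymous ones.
const sample = [
  { device_id: 'd1', user_id: 'u1' },
  { device_id: 'd1', user_id: 'u1' },
  { device_id: 'd2', user_id: null },
  { device_id: 'd3' },
];
const rate = identificationRate(sample);
console.log(rate); // 0.5
```

A per-user variant (share of device_ids that were ever linked to a user_id) answers a slightly different question; pick one definition and track it consistently.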

The fix: Audit the points in the auth flow where the user is known but identify has not been called. The most common gaps are SSO callbacks, mobile auto-login, and embedded experiences (Slack, Notion, etc.). Add the identify call at every such point with the canonical user_id.
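One way to keep those gaps closed is to route every "user is now known" moment through a single helper. The sketch below is hypothetical: identifyKnownUser and the shape of the user object are made up for illustration, and the client is passed in so the example runs without the SDK. The only SDK call it relies on is setUserId, which the Amplitude Browser SDK provides.

```javascript
// Hypothetical helper: call it from every auth path (login form, SSO callback,
// mobile auto-login, embedded experience) once the canonical user ID is known.
// `client` is any object exposing setUserId(id).
function identifyKnownUser(client, user) {
  const id = user && user.id;
  if (id === undefined || id === null || id === '') return false; // not known yet
  client.setUserId(String(id)); // canonical ID, not an email or session token
  return true;
}

// Stand-in client so the sketch runs without the SDK installed.
const calls = [];
const fakeClient = { setUserId: (id) => calls.push(id) };

const identified = identifyKnownUser(fakeClient, { id: 42 }); // e.g. SSO callback
const skipped = identifyKnownUser(fakeClient, null);          // user not known: no call
console.log(identified, skipped, calls);
```

Funnelling all auth paths through one function also gives you a single place to log or alert when a user reaches the app without an ID.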

2. Top events dominated by auto-collected and generic activity

A healthy event taxonomy has business-meaningful events in the top 10 by volume. feature_used, workflow_completed, subscription_started — those are the events the team actually analyses.

A broken taxonomy has the top 10 dominated by Page Viewed, [Amplitude] Active Session, Element Clicked, and other auto-collected events. The signal is buried in the noise.

This usually happens when teams enable Amplitude's auto-capture features for convenience, then never instrument the meaningful business actions on top. The platform is doing its job; the team has not yet done theirs.

The fix: Define the 5–10 events that matter most for the business. Instrument them explicitly. Keep auto-capture for diagnostic purposes but do not let it dominate the analysis surface. The auditor flags this as ds-event-volume-meaningful-share.
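A quick way to see where a project stands is to measure how much of the top-N list is business-meaningful. The sketch below is not the auditor's implementation of that check; it is a minimal illustration that assumes you supply the event volumes and your own list of names to treat as auto-collected or generic.

```javascript
// `counts` maps event name -> volume; `genericNames` lists the events you
// consider auto-collected or generic. Both are inputs you supply.
function meaningfulShareOfTop(counts, genericNames, topN = 10) {
  const generic = new Set(genericNames);
  const top = Object.entries(counts)
    .sort((a, b) => b[1] - a[1]) // highest volume first
    .slice(0, topN);
  if (top.length === 0) return null;
  const meaningful = top.filter(([name]) => !generic.has(name));
  return {
    top: top.map(([name]) => name),
    share: meaningful.length / top.length,
  };
}

// Illustrative volumes, not real data.
const result = meaningfulShareOfTop(
  {
    'Page Viewed': 9000,
    'Element Clicked': 7000,
    '[Amplitude] Active Session': 5000,
    feature_used: 1200,
    workflow_completed: 300,
  },
  ['Page Viewed', 'Element Clicked', '[Amplitude] Active Session'],
  4
);
console.log(result.share); // 0.25: one meaningful event in the top 4
```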

3. Naming convention drift

The original tracking plan probably said snake_case. After 12 months and three engineering teams, the production project has signup_started, SignupStarted, signup-started, and Signup Started all firing for the same action.

Amplitude treats these as distinct events. Funnel analysis breaks. Cohort definitions miss users. Retention undercounts. The team spends its time reconciling event names in dashboards instead of analysing behaviour.

The fix: Run the auditor's events-naming-snake-case check. For each violator, decide whether to migrate it to the canonical name or accept the drift. Then enforce naming via Amplitude Govern's required-pattern setting and via PR review on tracking changes.
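If you want to spot drift from a plain list of event names before running anything else, a small script can do it. The sketch below is an illustration, not the auditor's check: the function names are made up, and the normalisation rules (camel-case boundaries, spaces, and hyphens all become underscores) are an assumption about which variants should count as the same event.

```javascript
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

function isSnakeCase(name) {
  return SNAKE_CASE.test(name);
}

// Collapse common variants onto one snake_case key.
function toSnakeCase(name) {
  return name
    .trim()
    .replace(/([a-z0-9])([A-Z])/g, '$1_$2') // SignupStarted -> Signup_Started
    .replace(/[\s-]+/g, '_')                // spaces and hyphens -> underscore
    .toLowerCase()
    .replace(/_+/g, '_');
}

// Return [canonicalName, variants] pairs for names that collide after normalising.
function findDrift(eventNames) {
  const groups = new Map();
  for (const name of new Set(eventNames)) {
    const key = toSnakeCase(name);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(name);
  }
  return [...groups].filter(([, variants]) => variants.length > 1);
}

const drift = findDrift([
  'signup_started',
  'SignupStarted',
  'signup-started',
  'Signup Started',
  'checkout_completed',
]);
console.log(drift);
// [ [ 'signup_started',
//     [ 'signup_started', 'SignupStarted', 'signup-started', 'Signup Started' ] ] ]
```

The same regular expression can back a lint rule or a PR check on the tracking plan, so new violators are caught before they reach production.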

4. Session Replay shipped with default masking only

Amplitude Session Replay is a recent addition and a powerful one. The default privacy configuration masks form inputs — a sensible baseline. It does not mask any of the following:

  • Account balances rendered in <span> elements
  • Internal account IDs displayed in tables
  • PII shown in tooltips, modals, or notification banners
  • Anything in plain text content

For B2B or finance-adjacent products, this is a real compliance risk. Anyone with replay access can see customer data unredacted.

The fix: Add custom block and mask rules for elements that contain sensitive data. Use blockSelector for elements that should not appear in replays at all, and maskSelector for elements whose text should be redacted but whose presence should remain. Tag sensitive elements with data-sensitive or data-pii attributes in your codebase to make this a one-time discipline rather than an ongoing audit.

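import * as amplitude from '@amplitude/analytics-browser';
import { sessionReplayPlugin } from '@amplitude/plugin-session-replay-browser';

amplitude.add(sessionReplayPlugin({
  sampleRate: 0.1, // record 10% of sessions
  privacyConfig: {
    // Elements that should not appear in replays at all
    blockSelector: ['[data-sensitive]', '.balance'],
    // Elements that stay in the layout with their text redacted
    maskSelector: ['[data-pii]', '.account-number'],
  },
}));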

5. Govern unused or partial

Amplitude Govern is the product's governance layer — it lets you categorise events, mark some as hidden or blocked, define required properties, and enforce naming patterns.

In every audit we run, Govern is either completely empty or partially used. "Partially used" looks like: 3 categories defined, 30% of events categorised, no required properties, no blocking. The feature is on but it is not enforcing anything.

When Govern is empty, the taxonomy drifts because there is no enforcement layer. When it is partial, the team has the false sense that governance is in place when in fact only 30% of the surface is governed.
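To check whether governance is real or partial, it helps to measure coverage rather than eyeball it. The sketch below assumes you have a list of event definitions with a category and a list of required properties on each; that shape and the field names are hypothetical, so map them from whatever your taxonomy export actually contains.

```javascript
// Assumption: each event definition looks like
// { name, category, requiredProperties }, with category empty when unset.
function governanceCoverage(events) {
  if (events.length === 0) return null;
  const categorised = events.filter((e) => Boolean(e.category)).length;
  const withRequired = events.filter(
    (e) => Array.isArray(e.requiredProperties) && e.requiredProperties.length > 0
  ).length;
  return {
    categorisedShare: categorised / events.length,
    requiredPropsShare: withRequired / events.length,
  };
}

// Illustrative taxonomy: one of four events categorised, none with required properties.
const coverage = governanceCoverage([
  { name: 'signup_started', category: 'Onboarding', requiredProperties: [] },
  { name: 'feature_used', category: null, requiredProperties: [] },
  { name: 'workflow_completed', category: null },
  { name: 'subscription_started', category: null },
]);
console.log(coverage); // { categorisedShare: 0.25, requiredPropsShare: 0 }
```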

The fix: Pick one. Either commit to governing the taxonomy via Govern (categorise everything, set required properties, enforce naming), or pick a different governance approach (e.g., Avo) and disable Govern entirely. Half-governance is worse than no governance — it lulls the team into thinking the work has been done.

What to do next

If two or more of these sound familiar, the cost of leaving them is bigger than the cost of fixing them. Start by running an audit on your project — the auditor surfaces all five within ~60 seconds, with the specific findings and recommended fixes for your setup.

If the audit shows a deep set of issues, Adasight's data and analytics team runs full manual audits for clients with more complex stacks. Book a 30-minute consult to see if it is a fit.

FAQ

Are these issues common across all Amplitude tiers? Yes. We see the same patterns on Starter accounts and on Enterprise accounts with mature data teams. The misconfigurations are not capability issues — they are governance and review-cadence issues.

Can I fix these myself? Most of them, yes. The auditor surfaces what is wrong; the fixes are usually straightforward. The harder part is changing the team's review habits so the issues do not return six months later.

What if my identification rate is genuinely low because we have a lot of anonymous traffic? A low identification rate is fine for top-of-funnel marketing analyses (anonymous users browsing). It is a problem for product analyses (user behaviour over time, retention, cohorts). Segment your charts accordingly — anonymous traffic charts versus identified user charts — and stop treating the two as one stream.

Does fixing these require re-instrumentation? Sometimes. The naming-drift fix can require migration. The identification fix can require auth-flow changes. The Session Replay masking fix is a config update only. Most are smaller than they sound.
