There Are Ten Working Formats, Not Two

Discussions of hackathons collapse into a binary — open innovation versus problem-statement driven — and the conversation gets stuck. The actual ecosystem contains ten working format archetypes, each with distinct rules of engagement and failure modes.

Growing · Principle 5 · Format taxonomy · Last updated 2026-05-03

There are ten working hackathon formats, not two. Discussions of hackathons collapse with discouraging regularity into a binary — open innovation versus problem-statement driven — and the conversation gets stuck there before the work that should follow, namely choosing the format that fits what the event is actually trying to do, ever begins.

The two-format collapse is not just imprecise. It is a category error that hides real distinctions between events that look superficially similar but operate by entirely different rules of engagement. A Kaggle-style competition where every team optimizes the same metric and the leaderboard is the rubric is not the same kind of event as NASA Space Apps with its twenty-to-thirty Challenge Statements judged across five criteria, even though both could be loosely described as "problem-statement driven." A weekend game jam where five hundred independent solo developers ship in forty-eight hours and peer-judge the results is not the same kind of event as ETHGlobal, where teams build against partner-bounty rubrics judged by sponsor representatives, even though both could be loosely described as "open innovation." The 2022 academic survey of hackathon research and practice by Nolte and colleagues names the consequence in field-level terms: "repeating poor or, in worst cases, even harmful practices, leading to sub-optimal experiences." Bad format-fit is one of the structural causes, and the field's current literature does not name the distinctions clearly enough for organizers to avoid it.

The first archetype is the Single-Problem Competition. Every team solves for the same thing; the rubric is the problem; comparison is direct because the variance the rubric has to absorb is small. Kaggle competitions and DREAM Challenges are the canonical examples, with their leaderboard architecture and machine-evaluable metrics. The format's strength is that the apples-to-apples question central to fair-judging never arises, because there are no oranges. The cost is that creativity narrows to fit the metric the problem chose, and participants whose strengths sit outside that metric find the format unkind. The datathon is the sub-archetype that follows the same logic — Kaggle Days, the BMJ Datathon, university datathons like UM Datathon — and travels alongside Single-Problem Competitions because the mechanics are identical: single dataset, single objective metric, leaderboard-driven comparison.

The second through fourth archetypes are variations on a different idea: the rubric is not the problem, the rubric is the architecture. The Themed Multi-Track Open Innovation archetype runs explicit tracks with separate rubrics and prize pools, navigating cross-domain comparison by refusing to make the comparison at all. NASA Space Apps publishes twenty to thirty distinct Challenge Statements per year and awards in ten named categories, with judges assigned to challenges they have domain context for. HackDuke's Code for Good event (see hackduke-code-for-good) organizes around four social-good tracks (Education, Energy and Environment, Inequality, Health) with separate winners per track, and Random Hacks of Kindness operates the same architecture with civic-tech themes.

The Government Civic Challenge takes the multi-track logic and pushes it to a national or institutional scale, with each track owned by a real government stakeholder. Smart India Hackathon publishes more than a thousand problem statements per year, each owned by a specific Ministry or Public Sector Undertaking, each with a defined theme bucket — Smart Automation, MedTech, Disaster Management, Heritage, Renewable Energy, Blockchain, Tourism. Civic Hack DC operates the model at city scale, with an internal review board that vets submitted problem statements before publication. The format binds well-curated problem statements (see problem-statements) to government willingness to engage with the resulting solutions after the event, and that institutional follow-through is precisely what makes the archetype distinct from the Themed Multi-Track form it superficially resembles.

The Sponsor-Bounty Federation is the third variation. Instead of organizer-issued tracks, the event is structured around sponsor-funded bounties, each judged by the sponsor against narrow criteria. ETHGlobal is the canonical example, with partner bounties as parallel evaluation tracks and the convention that one project may apply to no more than three bounties. ETHDenver and ETHBoston operate the same model. The format works when sponsor agendas align with what builders actually want to build; it fails when sponsor capture turns the event into a vendor showcase and participants find themselves optimizing for whichever partner-prize happens to match what they were going to build anyway, regardless of whether that bounty actually serves the event's audience.
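The bounty cap described above is the kind of rule an organizer can check mechanically at submission time. The following is a minimal sketch under stated assumptions: the function name, the input shape (a mapping from project name to the bounties it applied for), and the decision to count distinct bounties are all hypothetical, and nothing here reflects ETHGlobal's actual tooling.

```python
# Hypothetical check for a "no more than three bounties per project" convention.
MAX_BOUNTIES_PER_PROJECT = 3  # cap taken from the convention described in the text


def find_cap_violations(applications: dict[str, list[str]],
                        cap: int = MAX_BOUNTIES_PER_PROJECT) -> dict[str, int]:
    """Return each project that applied to more than `cap` distinct bounties,
    mapped to the number of distinct bounties it applied for."""
    violations = {}
    for project, bounties in applications.items():
        distinct = set(bounties)  # assumption: duplicate entries count once
        if len(distinct) > cap:
            violations[project] = len(distinct)
    return violations
```

Running the check before judging begins keeps each sponsor's judging queue bounded and makes the cap a property of the submission form rather than something judges have to police.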

The fifth archetype is the University Season Hackathon, the form most people picture when they hear the word "hackathon," and the form the MLH-supported student league has standardized over the past decade. HackMIT, PennApps, TreeHacks, Cal Hacks, MHacks, HackPrinceton: each is an annual flagship event drawing hundreds to thousands of student participants, recruiter-driven, judged in science-fair-judging mode, and explicitly novice-friendly. The format's appeal is that it serves a clear audience (students learning the craft) with a clear value proposition (a weekend of learning, building, and recruiter visibility). The format's failure modes (demo-bias, hardware-skew bias, the over-reliance on Devpost's tooling) are well-documented and structurally bound to the format's strengths rather than separable from them.

The sixth and seventh archetypes form a deliberate pairing. The Internal Corporate Hackathon (Atlassian ShipIt, the Microsoft Hackathon, Cognizant Vibe Coding) is KPI-driven, anti-ringer because every participant is on payroll, and oriented around employee development as much as around output. Cognizant's August 2025 internal hackathon set a Guinness World Record with more than fifty thousand employees producing more than thirty thousand prototypes in ten days, a scale unimaginable until AI tooling caught up (see ai-era). The format's failure modes are theatre, where events run because the company wants to be seen running a hackathon, and manager capture, where outcomes are shaped by middle-management agendas rather than by the work itself.

The Platform Ecosystem Challenge is the external mirror of the Internal Corporate. Both formats are corporate-strategic; both are KPI-driven; both run at long durations with mentorship rounds rather than as single-weekend sprints. The difference is who they cultivate. Where the Internal Corporate format develops employees on payroll, the Platform Ecosystem Challenge develops external developers in the vendor's platform ecosystem, and platform usage is embedded in the central rubric rather than running parallel to it as a sponsor bounty would. Google Solution Challenge is the canonical example: its fifty-point rubric splits evenly into twenty-five points for Impact and twenty-five points for Technology, and the program reaches more than 110 countries. The Microsoft Imagine Cup (see microsoft-imagine-cup) is the long-running historical anchor, predating the modern AI era by more than two decades. Apple Swift Student Challenge runs the same logic at smaller scale within Apple's developer ecosystem. The format's failure modes are vendor lock-in, rubric narrowness when the platform criterion crowds out everything else, and the geographic disparity that becomes visible at global scale when participants in one region have substantially better access to mentorship, sponsorship, or platform tooling than participants in another.
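The fifty-point split mentioned above can be written down as a small scoring structure. This is an illustrative sketch only: the class and function names are hypothetical, the validation rules are assumptions, and the published rubric's sub-criteria are not modeled.

```python
from dataclasses import dataclass

# Criterion ceilings taken from the 25 + 25 split described in the text.
IMPACT_MAX = 25
TECHNOLOGY_MAX = 25


@dataclass(frozen=True)
class RubricScore:
    """One judge's score for one entry (hypothetical structure)."""
    impact: float
    technology: float

    def __post_init__(self):
        # Assumption: scores outside a criterion's range are rejected outright.
        if not 0 <= self.impact <= IMPACT_MAX:
            raise ValueError(f"impact must be within 0..{IMPACT_MAX}")
        if not 0 <= self.technology <= TECHNOLOGY_MAX:
            raise ValueError(f"technology must be within 0..{TECHNOLOGY_MAX}")

    @property
    def total(self) -> float:
        """Combined score out of fifty."""
        return self.impact + self.technology


def rank_entries(scores: dict[str, RubricScore]) -> list[str]:
    """Return entry names ordered from highest to lowest total."""
    return sorted(scores, key=lambda name: scores[name].total, reverse=True)
```

Because every entry is scored on the same two criteria, the ranking is a single sort. That is the practical difference from a bounty structure, where each sponsor would hold a separate scoreboard.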

The eighth and ninth archetypes sit adjacent to the competitive hackathon proper. The code-sprint — OpenBSD has run them since 1999 — is non-competitive: a single open-source codebase, a small group of contributors gathered in person, no winners and no rubric, just concentrated work over a weekend or week. Many open-source projects use the format alongside their conferences. It is included in the taxonomy because the word "hackathon" covers it culturally and historically, even though the competitive judging machinery that preoccupies the rest of this site does not apply to it at all. Code sprints are valuable precisely because they remove the ranking-against-peers dynamic and let contributors focus on the work.

The Game Jam — Global Game Jam, Ludum Dare, GMTK Jam — pairs a theme with a forty-eight-hour build window and lets peer judging dominate, sometimes supplemented by jury voting. The format's strength is that the constraint structure — same theme, same time budget, same delivery medium — produces apples-to-apples comparison naturally, in a way the open-innovation formats have to engineer through tracks or rubrics. The failure modes are genre conformity (the same theme tends to produce visually similar games, narrowing creative range) and peer-judging gaming (alliance voting and reciprocal-rating dynamics that drift the result away from quality and toward social position).

The tenth archetype is the most structurally distinctive of the ten, and the only one in which the team is not the unit of competition. The Blind Replication Sprint, also called the Convergence Hackathon, treats multiple independent teams as the unit of replication: each team works in isolation on a shared dataset, using methodologically distinct approaches by design, and the success criterion is convergence of results rather than ranking of results. The Event Horizon Telescope's October 2017 imaging exercise at the Black Hole Initiative, labeled directly on screen as the EHT Image Hackathon in Peter Galison's documentary record of the collaboration, and the seven-week sprint that produced the M87* image in 2018 are the canonical worked examples. Four teams, paired across two methodological families (Regularized Maximum Likelihood pipelines for the Americas and Global teams, CLEAN-based pipelines for the East Asia and Cross-Atlantic teams), worked under embargo before converging on a result that agreed across teams to better than 95% pixel-to-pixel correlation. The Critical Assessment of protein Structure Prediction (CASP), running biennially since 1994, is the institutionalized version, with experimental protein structures held out from competitors and scored against the truth only after submission; this is the protocol AlphaFold dominated at CASP14 in 2020. The parallel ATLAS and CMS analyses leading to the July 2012 Higgs boson announcement are the largest-scale version, with two independent collaborations of roughly three thousand physicists each operating on the same beam through different detectors. LIGO's blind injection methodology operates the same logic for gravitational wave detection.

The format earns its own page in the Manifesto, integrity-through-convergence, because the integrity mechanic (convergence as proof when no ground truth exists on the day of judgment) is a structural innovation rather than just another rubric variation, and its implications reach beyond Big Science into any hackathon whose outputs the world cannot independently verify on the day of judging.
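The convergence criterion described above can be made concrete with a pairwise comparison of team outputs. What follows is a generic sketch, not the EHT collaboration's pipeline: it assumes each team submits an image as a grid of pixel values on a common grid, uses plain Pearson correlation as the pixel-to-pixel measure, and treats the 0.95 threshold and all function names as illustrative choices.

```python
from itertools import combinations
from math import sqrt


def pearson(a: list[float], b: list[float]) -> float:
    """Pearson correlation between two equal-length pixel sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("sequences must share a length of at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / sqrt(var_a * var_b)


def convergence_report(images: dict[str, list[list[float]]],
                       threshold: float = 0.95):
    """Compare every pair of team images pixel by pixel.

    Returns (pairwise_scores, converged). `converged` is True only when
    every pair of teams meets the threshold (an assumed decision rule).
    """
    if len(images) < 2:
        raise ValueError("need at least two teams to test convergence")
    flat = {team: [p for row in grid for p in row]
            for team, grid in images.items()}
    scores = {
        (t1, t2): pearson(flat[t1], flat[t2])
        for t1, t2 in combinations(sorted(flat), 2)
    }
    return scores, min(scores.values()) >= threshold
```

The design choice worth noting is that no team is scored against a reference answer. The only inputs are the teams' own outputs, which is what lets the format function when no ground truth exists on the day of judging.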

Three sub-archetypes cut across the ten and earn mention without earning their own slot. The Hardware Kit-Standardized event (MIT Reality Hack, qualifying events for the Hackaday Prize) gives all teams identical hardware kits, eliminating the budget asymmetry that produces hardware-skew bias in mixed events; the format is most often a variant of Themed Multi-Track or University Season rather than its own thing. The Datathon, mentioned above alongside the Single-Problem Competition, is structurally identical to that archetype but differs in cohort: its audience and tooling center on data scientists rather than software engineers. The Founder-Track event (Y Combinator AI Startup Schools, Startup Weekend, AngelHack Global Series, TechCrunch Disrupt) is the format whose ringer-tolerance is the point rather than the bug. Founder-Track events advertise themselves as recruitment and pipeline events; funded teams showing up is what they exist for. The structural problem covered in no-ringers-without-disclosure arises only when an event mislabels itself, presenting a Founder-Track format as if it were a student fair-fight, or vice versa.

What the ten-archetype taxonomy gives organizers is a vocabulary for honest event design. An organizer who can name which archetype they are running can choose rubrics, integrity mechanics, sponsor structures, and communication tone that fit the format rather than fighting it. An organizer who cannot, because they are operating from the two-format binary, ends up running a Themed Multi-Track event with University Season conventions, or a Sponsor-Bounty Federation with Founder-Track integrity discipline, and the participant experience suffers in ways the organizer cannot diagnose afterward because the conceptual frame for diagnosis was never in place. The taxonomy is also what makes fair-judging operationalizable: the apples-to-apples principle's three architectures (single-problem, explicit tracks, abstracted uniform rubric) map onto specific archetypes in this list, and choosing the right judging architecture follows from choosing the right format. Without the taxonomy, judging architecture decisions become ad hoc; with it, they become principled.

The taxonomy is not closed. Hackathon culture continues to develop, and new archetypes will earn their own pages as they prove distinctive enough to warrant separate treatment. Subsequent pages in The Book treat each of the ten in detail, with worked examples, characteristic judging mechanics, and named failure modes. The case studies nasa-space-apps, ethglobal, smart-india-hackathon, google-solution-challenge, cognizant-vibe-coding, and event-horizon-telescope work the most distinctive examples in depth. The judging principle fair-judging shows how the three valid judging architectures map onto specific archetypes; the framing principle the-frame shows why every archetype except the Code Sprint requires a frame to function; and the integrity principle integrity-through-convergence shows what fair judging looks like in the one archetype where conventional rubrics do not apply at all.