Paper draft
Readable HTML version of the generated Markdown draft.
Claim ledger
5 claims traced to source paper IDs and evidence rows.
Audit status
pass 8 supported claims, 5 filled evidence rows.
Release notes
- Generated at 2026-05-08T02:31:06+00:00 from local artifacts in cs-ai/research-harnesses.
- The site intentionally excludes the localhost tmux control UI; only static research artifacts are public.
- Rows marked abstract-derived still require full-text audit before submission-level claims.
Evidence-backed papers
| Paper | Claim | Notes |
|---|---|---|
| 2605.03042v1 | Long-horizon single-agent research is unreliable by default; the central failure mode is plausible unsupported success, mitigated through persistent state, modular execution, and independent assurance via cross-family executor/reviewer separation. | Full text read from arXiv PDF. Integration implication: our harness should keep evidence_matrix/claim ledger as first-class artifacts, add research-harnesses taxonomy, and keep reviewer-independent audits as hard invariants. |
| 2504.08066v1 | AI Scientist-v2 reports an end-to-end agentic system that formulates hypotheses, designs and executes experiments, analyzes and visualizes data, and writes manuscripts; compared with v1 it removes reliance on human-authored code templates and uses progressive agentic tree search. | Abstract-derived evidence row; needs full-text audit before final submission. |
| 2603.28589v1 | Clinical autonomous research needs domain-specific evidence grounding; Medical AI Scientist transforms surveyed literature into actionable evidence and uses clinician-engineer co-reasoning to improve traceability of generated ideas. | Abstract-derived evidence row; full text required before citing numerical comparisons. |
| 2507.23276v2 | Current AI Scientist systems have visible achievements, but the field still needs clarity on bottlenecks and critical components before scientific agents can produce ground-breaking discoveries that solve grand challenges. | Abstract-derived evidence row; needs full text for taxonomy details. |
| 2511.04583v4 | Jr. AI Scientist follows a novice-student-like workflow from a baseline paper: analyze limitations, formulate hypotheses, iterate experiments until improvements, and write a result paper; the work also reports risks and limitations of current AI Scientist systems. | Abstract-derived evidence row; useful for risk and workflow comparison. |
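
The release-note rule that abstract-derived rows need a full-text audit before submission-level claims can be checked mechanically. The following is a minimal sketch, not the harness's actual code: `EvidenceRow`, `needs_full_text_audit`, and `ROWS` are hypothetical names, and the check assumes that a row's Notes cell begins with "Abstract-derived" whenever the full text has not been read, as it does in the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRow:
    """One row of the evidence table (hypothetical structure)."""
    paper_id: str
    notes: str


def needs_full_text_audit(row: EvidenceRow) -> bool:
    # Assumption: rows whose notes start with "Abstract-derived"
    # have not yet been verified against the full text.
    return row.notes.strip().lower().startswith("abstract-derived")


# Paper IDs and the leading words of each Notes cell, copied from the table.
ROWS = [
    EvidenceRow("2605.03042v1", "Full text read from arXiv PDF."),
    EvidenceRow("2504.08066v1", "Abstract-derived evidence row; needs full-text audit before final submission."),
    EvidenceRow("2603.28589v1", "Abstract-derived evidence row; full text required before citing numerical comparisons."),
    EvidenceRow("2507.23276v2", "Abstract-derived evidence row; needs full text for taxonomy details."),
    EvidenceRow("2511.04583v4", "Abstract-derived evidence row; useful for risk and workflow comparison."),
]

# Rows that still block submission-level claims under the stated rule.
pending = [row.paper_id for row in ROWS if needs_full_text_audit(row)]
print(pending)
```

Under these assumptions, four of the five rows remain pending; only 2605.03042v1 is backed by a full-text read. Matching on a text prefix is fragile, so a real ledger would more likely carry an explicit status field per row.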