Lead Product Engineer
About the Organization
The Collective Intelligence Project builds infrastructure enabling global input into AI system development and governance. The organization combines large-scale deliberation, participatory evaluation, and institutional partnerships. It operates as a small team supported by major foundations including Google.org, Omidyar Network, and Future of Life Foundation, working with AI labs and governments.
About the Role: The position involves building and maintaining full-stack platforms with complex data, rich visualizations, and polished user experiences. The core challenge is making complex data intelligible to mainstream audiences, including journalists, academics, and engineers.
Primary focus will be continuing development of Weval (weval.org), an evaluation platform used by AI labs and governments to assess frontier models on questions automated benchmarks cannot address, such as handling mental health crises, delivering accurate legal advice in Indian languages, and detecting political bias.
Secondary work includes Global Dialogues (70+ countries gathering public input on AI), Digital Twin evaluations, and democratic AI governance tool deployments.
Key Responsibilities:
Weval Development (~60%):
- Build core platform features: evaluation authoring tools, leaderboards, data pipelines for collecting and analyzing human judgments
- Develop APIs and integrations enabling labs (Anthropic, OpenAI, Cohere) and governments to run Weval evaluations
- Design and implement rich data visualizations and interactive interfaces that make complex evaluation data legible to policymakers, journalists, and other non-technical audiences
- Create tools enabling non-technical users to design and deploy evaluations
- Own key architectural decisions as the platform scales
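Weval's internal schemas and APIs are not described in this posting, so the following is only a minimal sketch of the kind of pipeline step the responsibilities above imply: aggregating collected human judgments into per-model scores that a leaderboard could consume. All names and fields (`JudgmentRecord`, `meanScoresByModel`, the 1–5 score scale) are illustrative assumptions, not the platform's real schema.

```typescript
// Hypothetical record shape for one human judgment of one model's output.
// Field names and the 1-5 scale are assumptions for illustration only.
interface JudgmentRecord {
  evaluationId: string; // which evaluation this judgment belongs to
  model: string;        // model under test, e.g. "model-a"
  raterId: string;      // anonymized human rater
  score: number;        // 1 (fails criteria) .. 5 (fully meets criteria)
}

// Collapse raw judgments into a mean score per model -- the simplest
// aggregation a leaderboard view might read from.
function meanScoresByModel(records: JudgmentRecord[]): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const r of records) {
    const entry = sums.get(r.model) ?? { total: 0, count: 0 };
    entry.total += r.score;
    entry.count += 1;
    sums.set(r.model, entry);
  }
  const means = new Map<string, number>();
  for (const [model, { total, count }] of sums) {
    means.set(model, total / count);
  }
  return means;
}
```

In practice this step would sit behind an API route and handle weighting, rater agreement, and confidence intervals; the sketch only shows the basic shape of the data flow.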
Supporting Other CIP Projects (~30%):
- Global Dialogues: Analyze and visualize data from 10,000+ participants across 70+ countries
- Digital Twins: Develop evaluation infrastructure testing AI agent accuracy in representing diverse groups' values
- New experiments: Prototype tooling for partners
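The Global Dialogues data format is likewise not public, so as a rough sketch under assumed field names (`DialogueResponse`, `country`), here is the kind of grouping step that analyzing responses from 70+ countries might start from: counting responses per country before visualization.

```typescript
// Illustrative only: the real Global Dialogues records surely carry far
// more structure; "country" and "answer" are assumed field names.
interface DialogueResponse {
  country: string; // ISO code or country name
  answer: string;  // free-text or coded response
}

// Tally how many responses came from each country -- a typical first
// step before feeding a map or bar-chart visualization.
function responseCountsByCountry(
  responses: DialogueResponse[],
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const r of responses) {
    counts[r.country] = (counts[r.country] ?? 0) + 1;
  }
  return counts;
}
```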
Required Qualifications:
- 3-5 years software engineering experience with strong frontend focus
- Significant experience with NextJS, React, and TypeScript
- Shipped products that users find valuable
- Strong product sensibility regarding UX, design quality, and user-focused building
- Genuine facility with AI tools (e.g., Claude, Cursor) integrated into your daily workflow
- Comfortable working independently with pragmatic technical decision-making and rapid execution
- Genuine enthusiasm for CIP's democratic AI infrastructure mission
- Preference for applicants with working hours overlapping Pacific or Eastern time zones
Nice-to-Have:
- Experience with AI evaluation platforms, survey tools, research infrastructure, or data collection systems
- Experience with Supabase/Postgres, Vercel/Netlify
- Background in mission-driven organizations, civic tech, research, or academic settings
- Open source contributions, technical writing, or community building
12-Month Outlook: The role anticipates less line-by-line coding and more orchestrating AI agents, making architectural decisions, and reviewing output from increasingly capable coding tools. Value shifts toward system-design judgment, quality standards, and managing parallel workstreams executed by AI.
Compensation: $150,000, plus health/dental/vision insurance, 403(b), generous PTO, flexible hours, life accommodations, and an output-focused culture; hybrid in-office/remote on Pacific time.