Hello Dojo
Outcome
- 4 products shipped: Customer App, Driver App, Fleet Portal, Vendor Portal
- 6 months: discovery to TestFlight, an end-to-end production build
- 30+ shared components: one system across all 4 products
- Beta: launched on TestFlight as a private beta
One-line impact
Transformed a vague "Uber with AI" concept into a production-ready, multi-sided hospitality platform centered on an agentic concierge that unified rideshare, dining, and premium experiences across Ibiza's fragmented booking ecosystem.
Snapshot
- Role: Product Design Engineer & Product Manager
- Timeline: November 2025 – May 2026 (6 months)
- Company: helloDojo
- Platform: iOS (Customer App, Driver App), Web (Fleet Manager Portal, Vendor Portal, Landing Page)
- Team: 2 founders (CTO + CEO), 1 designer (me), 1 subcontracted design systems lead, 1 QA
- Scope: End-to-end product discovery, UX architecture, design system definition, frontend implementation, branding, and launch readiness across 4 interconnected products
- Status: TestFlight (private beta)
Customer App
iOS — book a ride, table, yacht or VIP
Driver App
iOS — accept requests, navigate, complete
Fleet Portal
Web — vehicles, drivers, compliance
Vendor Portal
Web — availability, bookings, demand
Executive summary
helloDojo is a multi-sided platform that solves a real operational problem in Ibiza: the fragmented, relationship-dependent booking ecosystem where taxis, restaurants, yachts, and experiences require separate calls, WhatsApp connections, and insider knowledge.

My role was to transform a loose product vision ("Uber with AI") into a cohesive, production-ready system. I owned product discovery, UX architecture, design system leadership, frontend implementation, and brand positioning. The core insight, making an AI concierge named Dojo the primary interface rather than replicating Uber's transactional design, became the foundation for the entire experience. Critically, I designed three distinct interaction modes (Voice Mode, Chat Mode, and UI Flow) that let users choose how to interact with Dojo based on their context and preference, all while keeping the agentic experience central.

In 6 months, I shipped a fully integrated platform across iOS and web with reusable component systems, production-ready code, and clear operational logic, despite significant scope pressures.
Voice Mode
Hold-to-talk, premium hands-free. Dojo speaks back and shows option cards beneath the avatar — same conversation continues whether you tap or speak.
Chat Mode
Same flow as voice, written. For loud clubs, hearing accessibility, or just preference. Option cards drop in below each Dojo reply.
UI Flow
For users who want to browse all 50 restaurants before deciding. Stepped, filterable, elevated visual treatment — still routes through the same booking engine.
Context
Ibiza's hospitality and nightlife ecosystem operates through fragmentation and personal connections. Tourists and residents need taxis, dinner reservations, yacht bookings, VIP table access, and event tickets—but each service requires separate apps, phone calls, or WhatsApp conversations with concierges who gatekeep availability and pricing. The friction is real: no unified booking surface, no transparent pricing, no integration between services, and access depends on knowing the right person.
The CEO had validated demand for rideshare specifically and saw the opportunity to build a broader platform. However, the vision was still abstract: "Uber meets AI booking." The business goal was to own the primary booking layer for premium experiences in Ibiza and capture transaction volume across multiple service categories.
My entry point was to make that vision concrete. The market opportunity was clear. The product shape was not.
Rideshare
Apps, taxi ranks, WhatsApp groups
Dining
Resy, calls, concierge favours
Yachts
Brokers, deposits, side-channel WhatsApp
Nightlife & events
VIP lists, door staff, who-you-know
Problem statement
User problem
Users in Ibiza—tourists, groups, VIP guests, nightlife customers—face repeated friction:
- Waiting 30+ minutes for a taxi because the system is informal and unpredictable
- Making multiple calls or messages to different venues to book experiences
- No unified way to discover or reserve rides, dining, yachts, events, or premium tables
- Language barriers and unfamiliar local connections make access harder
- During peak season, the sound and rhythm of nightlife make voice communication impractical inside clubs and venues, yet users still want the speed and simplicity of voice interaction wherever it is possible
Drivers and fleet operators need operational clarity: routing, real-time demand signals, compliance tracking, and the ability to scale beyond single-operator models.
Venues (restaurants, yachts, nightclubs, beachclubs) need a structured way to receive bookings, manage availability, and understand demand patterns without managing multiple channels.
Business problem
The CEO's thesis was that Ibiza's hospitality market was ripe for consolidation. The business needed to:
- Establish a single point of entry for premium experiences (defensible position in a fragmented market)
- Capture transaction volume across rideshare, dining, accommodations, and events (multiple revenue streams)
- Build operational infrastructure that venues and drivers would adopt (network effects)
- Launch with enough credibility and UX polish to attract both users and vendors in a competitive, image-conscious market
The risk: launching half-baked or fragmented would fail to differentiate. The opportunity: own the primary booking interface in Ibiza's premium segment before competitors entered.
Product problem
How do you create a single, coherent experience that unifies radically different service categories (instant rideshare, advance bookings, discovery, payments, real-time logistics) without overwhelming the user or fragmenting into isolated modules?
The secondary problem: How do you design for an environment where voice interaction is often preferred, but users need flexibility based on context (loud clubs require typing, but users still want conversational booking flow)?
The third problem: How do you build operational UX for drivers (mobile, on the road, safety-critical) and vendors (desktop, async management) that feels connected to the consumer experience rather than bolted-on?
User problem
Tourists and locals need rides, dinners, yachts, tables — but the system runs on insider connections, scattered apps, and concierges behind language barriers.
Business problem
Capture transaction volume across rideshare, dining and events before competitors land. Half-baked or fragmented = failure in an image-conscious market.
Product problem
Unify radically different service categories (instant rides, advance bookings, payments, logistics) under one experience — without overwhelming or fragmenting.
My role
I owned product discovery, UX architecture, design system strategy, and frontend implementation across all customer-facing and operational products.
When I arrived, there was backend infrastructure but no user experience, no product logic translated into flows, and no visual direction. The CTO was focused on architecture and APIs. I was responsible for everything else: understanding the ecosystem, defining user journeys, making trade-off decisions, creating a unified design language, and translating all of that into buildable, production-ready code.
Responsibilities:
- Product discovery & strategy: Researched Ibiza's hospitality ecosystem, identified user behaviors and pain points across multiple personas (tourists, drivers, venue operators), and translated the abstract "Uber with AI" concept into a concrete product model
- UX architecture: Defined user journeys across 4 distinct user types (customers, drivers, fleet managers, vendors); designed three distinct interaction modes (Voice Mode, Chat Mode, UI Flow) for customer booking; mapped edge cases, states, and service logic
- Brand & positioning: Created Dojo (the AI concierge mascot), defined visual identity, tone of voice, and the core narrative around agentic interaction
- Design system leadership: Defined components, tokens, variants, responsive behavior, and interaction states; established visual hierarchy and background elevation rules to distinguish interaction modes; led a subcontracted design systems team to build scalable, AI-readable design documentation
- Frontend implementation: Built Customer App, Driver App, Fleet Manager Portal, and Vendor Portal in React + Tailwind using AI-assisted workflows; managed design-to-code translation via Figma MCP; ensured visual and behavioral consistency across products and interaction modes
- Product decisions: Made foundational calls on feature prioritization, interaction paradigms, data integration, and scope management
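The background-elevation rules that distinguish interaction modes can be sketched as a small theme map. This is a minimal illustration, not the production design system: the type names, token values, and Tailwind classes here are hypothetical stand-ins.

```typescript
// Hypothetical mode-theme map. Names and Tailwind classes are
// illustrative, not the real helloDojo tokens.
type InteractionMode = "voice" | "chat" | "ui";

interface ModeTheme {
  background: string;                // Tailwind class for the surface
  elevation: "flat" | "raised";      // UI Flow gets the raised "browse" look
  showAvatar: boolean;               // Dojo avatar visible in this mode
}

// Voice and Chat share the dark, conversation-centric surface;
// UI Flow switches to the lighter, elevated treatment with no avatar.
const modeThemes: Record<InteractionMode, ModeTheme> = {
  voice: { background: "bg-zinc-950", elevation: "flat",   showAvatar: true },
  chat:  { background: "bg-zinc-950", elevation: "flat",   showAvatar: true },
  ui:    { background: "bg-gray-50",  elevation: "raised", showAvatar: false },
};

function themeFor(mode: InteractionMode): ModeTheme {
  return modeThemes[mode];
}
```

Encoding the rule in one map keeps the "dark = conversation, light = browse" signal consistent wherever a component needs to render mode-aware.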
Constraints
The project operated under real-world pressures that shaped every decision:
- Scope ambition vs. timeline: 6 months to deliver a production-ready multi-sided platform (Customer App, Driver App, Fleet Manager, Vendor Portal, Landing Page, design system, branding). No room for iteration cycles or post-launch pivots.
- No pre-existing research or validation: The CEO validated rideshare demand. Everything else—dining, yachts, events—was an educated guess. I had to design without user testing or competitor teardowns.
- Parallel system complexity: Building 4 interconnected products simultaneously meant decisions in one system rippled into others. No room for isolated design thinking.
- Three interaction modes simultaneously: Designing Voice Mode, Chat Mode, and traditional UI Flow all at once introduced complexity. Each mode had to feel coherent, but distinct enough to serve different contexts.
- AI-driven interaction as core UX: Designing for voice + chat + UI simultaneously introduced technical and UX unknowns. Text-to-speech latency, voice recognition accuracy, fallback UX for ambiguous commands—all had to be solved in design.
- Operational UX at scale: The Driver App had to work for drivers actively on the road; the Fleet Manager portal had to surface real-time logistics, compliance, and demand signals. High stakes for safety, clarity, and speed.
- Single design system for multiple product contexts: Apps and web, consumer and operational, mobile-first and desktop. One system had to serve all without becoming a lowest-common-denominator soup.
- Leadership changes mid-project: The CTO (technical co-founder) left during execution, shifting decision-making and technical continuity.
6-month runway
Multi-sided platform, no room for pivots
No prior validation
Only rideshare demand was confirmed
Parallel system complexity
Decisions in one product rippled into all 4
Three modes at once
Voice + Chat + UI Flow, all coherent
Novel AI-driven UX
Latency, recognition, fallback all open
Safety-critical ops
Driver app + fleet portal on the road
One DS, many contexts
Apps + web, consumer + operational
Leadership change
CTO left mid-execution
Product strategy
The work was guided by one core principle: Make Dojo—the AI concierge—the primary interface, and keep users talking to Dojo as much as possible.
Initial thinking fell into the Uber/Cabify trap: map + tap to request. But that pattern misses what makes Ibiza different. During peak season, the island pulses with ambient energy—music, rhythm, social flow. Users are in clubs, villas, boats. Voice interaction isn't a nice-to-have; it's the natural mode. But voice isn't always practical (loud venues, hearing difficulties, preference for writing). Chat becomes the fallback. And traditional UI becomes the third path for users who prefer structure or want to browse all options.
The key insight was: All three modes should feel like talking to Dojo, not like three separate products. Whether you speak, type, or tap, you're interacting with the same agent, the same flow, the same booking logic. The mode changes; the core experience doesn't.
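One way to read "same flow, same booking logic" is that every mode normalizes the user's request into a single intent shape before it reaches the booking engine. A hedged sketch follows; `BookingIntent`, `submit`, and the category names are illustrative, not the production schema.

```typescript
// Illustrative only: every interaction mode produces the same
// normalized intent, so the booking engine is mode-agnostic.
type Mode = "voice" | "chat" | "ui";
type Category = "ride" | "restaurant" | "yacht" | "beachclub" | "nightclub";

interface BookingIntent {
  mode: Mode;            // how the user asked; recorded, but irrelevant downstream
  category: Category;
  partySize?: number;
  when?: string;         // ISO 8601 timestamp
  destination?: string;  // rideshare only
}

// The engine only ever sees a BookingIntent — it cannot tell a spoken
// request apart from a tapped one.
function submit(intent: BookingIntent): string {
  return `booking:${intent.category}:${intent.destination ?? intent.partySize ?? "open"}`;
}

// Same request, two different entry modes, one booking.
const spoken: BookingIntent = { mode: "voice", category: "ride", destination: "Pacha" };
const tapped: BookingIntent = { ...spoken, mode: "ui" };
```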
This meant rejecting traditional SaaS patterns:
- Agentic as default, UI as support. Voice Mode and Chat Mode should be the primary path. Users feel like they're talking to a capable concierge. Dojo shows options as UI cards, but the conversation continues—users can tap or speak to select.
- Preserve voice-first positioning even in UI Flow. The traditional UI Flow (for users who don't want to talk or chat) still needed to feel premium and agentic, not transactional like Uber. Different visual treatment, but same underlying logic.
- Unified service taxonomy over service silos. Rideshare, dining, yachts, events—these aren't separate products. They're service categories accessed through a single interface, regardless of mode. Reuse flows, components, and mental models wherever possible.
- Real-time operational transparency. Drivers need to see demand and logistics in real time. Fleet managers need to understand vehicle deployment and compliance. Venues need async booking management. Transparency reduces friction and builds trust.
- User continuity and reduced friction. If a user books a restaurant and then needs a ride, the system should remember them, pre-fill information, and reduce re-entry friction. User sharing (detecting returning users and opening existing accounts) became a core feature.
- Scope discipline through feature prioritization. Launch rideshare first, stabilize, gather feedback, then expand to dining and experiences. The CEO wanted everything at once. The strategy was to push back and build a defensible, validated core before expanding.
Agentic by default, UI as support
Voice and Chat are the primary path. Users feel they are talking to a capable concierge. UI cards support the conversation — they never replace it.
Voice-first DNA, even in UI Flow
The structured browse path still feels premium and agentic — same Dojo voice, same booking logic — just a different visual treatment.
Unified service taxonomy
Rideshare, dining, yachts and events are not separate products. They are service categories accessed through one interface, one mental model.
Real-time operational transparency
Drivers see demand. Fleet managers see deployment. Venues see availability. Transparency reduces friction across all four sides of the platform.
User continuity and reduced friction
Book a restaurant, then need a ride — the system remembers you, pre-fills, opens the right account. Switching costs measured in seconds.
Scope discipline through prioritization
Rideshare first, then expansion. Push back on big-bang launches; build a validated core before stretching across categories.
Key decisions
| Decision | Why it mattered | Trade-off |
|---|---|---|
| Three interaction modes (Voice, Chat, UI Flow), not voice-only | Users in loud clubs can't use voice. Users with hearing loss need alternatives. Users who want to browse all options need UI. Supporting all three modes makes the product usable across contexts while keeping Dojo central. | Increased complexity: three distinct UX flows, three state machines, three visual treatments, but same underlying booking logic. More edge cases to handle. |
| Voice Mode and Chat Mode show UI elements (cards, buttons, selectors), not pure conversation | Pure conversation without visual options creates ambiguity. Showing options (restaurants, pricing, availability) as UI cards while maintaining conversational flow gives users clarity and control. | Risk of splitting user attention between voice/text and UI. Required careful information hierarchy and visual clarity. |
| UI Flow has elevated background and distinct visual treatment | Users in nightlife need to identify interaction modes quickly. Dark background for Voice/Chat, lighter elevation for UI Flow signals a shift from agentic to structured browsing. Cognitive clarity reduces friction. | Breaks visual cohesion if not carefully designed. Required thought about when users transition and how to make the shift feel natural, not jarring. |
| Keep users in Voice/Chat mode; don't funnel to UI Flow | The differentiator is agentic booking. Once a user starts talking to Dojo, keep them talking. Don't transition them to traditional UI halfway through. | Loses some flexibility. Users who want to browse all 50 restaurants upfront can't easily do that in Voice/Chat Mode; they have to switch to UI Flow. |
| Dojo as primary interface, not Uber-style map | Users in Ibiza's environment prefer voice/chat. Agentic interaction feels premium and differentiates from commodity ride-hailing. Dojo becomes the brand and the UX. | Lost immediate clarity of map-based navigation. Requires robust fallback UI and clear affordances for users unfamiliar with voice-first apps. |
| Unified booking flow for restaurants, yachts, beachclubs, nightclubs | Reduces component duplication, simplifies mental model for users, makes the system maintainable. Party size, date, time are universal. Build once, apply everywhere. | Loses service-specific nuance. A yacht booking is not the same as a nightclub VIP table, but forcing them into one pattern obscures those differences. Required careful naming and context. |
| User sharing and account continuity | Reduces friction for repeat users. System detects existing account and opens it automatically. Builds habit and engagement. | Required backend coordination and careful privacy/security handling. Risk of confusing users if the system auto-opens wrong account. |
| Rideshare-first, then expansion | Validates the core value proposition before expanding scope. Rideshare is the easiest to operationalize and has the clearest value for users. | Means dining, yachts, events launch later and with less market research. Creates pressure to ship everything simultaneously (which happened anyway due to CEO pressure). |
| Single offer-creation system for all venue types | CTO wanted one unified backend for offers (restaurants, venues, yachts, nightclubs). One system, one API. | Loses service-specific logic. A restaurant's Happy Hour offer is not the same as a nightclub's VIP table pricing. The system became a lowest-common-denominator model that required workarounds. |
Process
Option A: Uber-style map interface with AI booking layer
Replicate Uber's map-based, tap-to-request model. Users see location, available drivers/services, tap to request. AI assists via natural language, but map remains primary interface.
Pros: Familiar mental model; clear visual affordances for logistics; fast request flow.
Cons: Doesn't differentiate from competitors; ignores Ibiza's environmental context; splits interaction paradigm; treats rideshare and experiences separately.
Option B: Form-based booking portal with AI assistant
Traditional SaaS model. Users navigate UI, select services via menus, fill forms. AI assists via chat.
Pros: Straightforward to build; AI feels like helpful layer; familiar for business users.
Cons: Defeats premium concierge positioning; not different from Booking.com or Resy; form-filling is opposite of "speak and get it done"; AI feels bolted-on.
Option C: Agentic interface with Dojo as primary, voice + chat + UI as three distinct modes
Dojo is primary interface across three modes. Voice Mode: Users speak to Dojo, Dojo responds with audio and shows option cards. Chat Mode: Users type, Dojo responds in text with option cards. UI Flow: Structured booking (like Uber/Resy) with steppers and filters, no conversational element. All three converge on same booking confirmation.
Pros: Highly differentiated; matches Ibiza's environment; three modes handle real-world friction; premium feel; scales across multiple service categories; voice/chat curate options (reduces choice paralysis); UI Flow shows all options (for browsers).
Cons: Most complex to design/build; requires clear mental model; three visual treatments must feel cohesive; dependent on API quality.
Final direction
Option C won. Once framed—Ibiza's environment, user preferences, premium positioning—the agentic model became obvious. The CEO and CTO aligned immediately. Dojo wasn't just an AI feature; it was the product.
The design challenge: How to make three distinct modes feel like one product? Answer: Architecture and visual hierarchy. Voice/Chat share dark background and Dojo-centric layout (signals "conversation"). UI Flow uses elevated visual treatment (signals "browse and select"). All three share components, tokens, and booking logic, so the product feels coherent.
Option A — Uber clone
Map-first, tap-to-request, AI layered on top
Map remains primary
Tap to request
AI assists via NLP
Zero differentiation
Ignores Ibiza's environment
Option B — SaaS portal
Forms + menus, AI assistant on the side
Browse and form-fill
AI as helpful chat layer
Defeats concierge positioning
Just another Resy clone
Option C — Agentic, three modes
Dojo as primary across Voice / Chat / UI
Voice — premium hands-free
Chat — accessible fallback
UI Flow — structured browse
One Dojo, one booking engine
Three doors, same room
UX architecture
The platform served 4 distinct user types with overlapping but separate journeys. The core insight was identifying what was universal (party size, date, time, availability, confirmation) and what was service-specific (location for rideshare, menu preferences for dining, guest count for yachts, etc.).
Core journeys:
- Customer: Rideshare request (Voice Mode) → Open app → See Dojo avatar and "How can I help you today?" → Tap Dojo or say "Dojo" → Hold to talk: "I need a taxi to Pacha" → Dojo processes via NLP, confirms pickup and destination → Displays driver assignment and real-time tracking → Ride begins → Completion and rating
- Customer: Rideshare request (Chat Mode) → Open app → Tap chat icon → Type "I need a taxi to Pacha" → Dojo responds in text, shows driver options or direct assignment → User can tap to confirm or type follow-up questions → Real-time tracking → Completion and rating
- Customer: Rideshare request (UI Flow) → Open app → Tap "Use UI" or access from menu → Select "Get a ride" → Enter pickup location (map or address) → Enter destination → See available drivers with ETA and price → Select driver → Confirm → Real-time tracking → Completion and rating
- Customer: Experience booking (Voice Mode) → Open app → Tap Dojo → "I want dinner for 4 tomorrow at 10 PM" (voice or text) → Dojo asks clarifying questions if needed ("Cuisine preference?") → Dojo displays available venues with pricing and details as UI cards → User taps to see full venue details or says "first option" → Confirmation → Booking complete
- Customer: Experience booking (Chat Mode) → Open app → Tap chat input box → Type "dinner for 4 tomorrow 10 PM" → Dojo responds, shows venue options as cards → User taps or types to select → Booking complete
- Customer: Experience booking (UI Flow) → Open app → Select service type (Restaurant / Yacht / Beachclub / Nightclub) → Enter party size, date, time → See all available venues with filters (cuisine, price, distance, rating) → Tap venue for details → Confirm booking → Payment
- Driver: Accept and complete rides → Driver app shows incoming ride requests in real time → Tap to accept → Navigation to pickup → Pickup confirmation → Navigation to destination → Completion and rating
- Vendor: Manage availability and bookings → Vendor portal login → Overview dashboard → Manage availability calendar → View incoming booking requests → Accept/decline bookings → Track booked slots → View demand heatmap (where requests are coming from)
- Fleet manager: Monitor operations → Fleet portal login → Real-time map view of all vehicles and their status → Heatmap of ride requests across the island → Vehicle management (add/remove vehicles) → Driver assignment (2-3 drivers per vehicle for rotation) → Compliance tracking (document expiration dates for drivers and vehicles)
Important states:
- Idle/Listening (Voice Mode): Dojo avatar floating, waiting for user input; "How can I help you today?" prompt visible
- Processing (Voice Mode/Chat Mode): Dojo processing user request, asking clarifying questions; avatar reduced in size, moved up; options appear below
- Options display (Voice Mode/Chat Mode): Dojo showing available options via cards, buttons, or selectors below the avatar; user can tap or speak
- Confirming (all modes): User confirming selection; system proceeding with booking
- In-transit (rideshare): Real-time tracking of driver, ETA, live location
- Awaiting fulfillment (experience): Booking confirmed; awaiting venue acceptance
- Accepted/Completed: Transaction complete; rating/feedback flow
- Error/Fallback (Voice Mode): Voice failed or unclear; system prompts user to repeat, tap options, or switch to Chat/UI Flow
- Empty state (no results): No available drivers/venues; Dojo suggests alternatives or queuing
- Compliance: Fleet manager viewing document expiration and renewal status
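The states above behave like a small state machine. The sketch below mirrors the list with a simplified transition table; state and event names are illustrative, and real flows branch further per service category.

```typescript
// Simplified Voice/Chat state machine. States mirror the list above;
// transitions are a sketch, not the production logic.
type DojoState =
  | "idle" | "listening" | "processing" | "options"
  | "confirming" | "inTransit" | "completed" | "error";

type DojoEvent =
  | "holdToTalk" | "release" | "understood" | "selected"
  | "confirmed" | "rideEnded" | "recognitionFailed" | "retry";

const transitions: Record<DojoState, Partial<Record<DojoEvent, DojoState>>> = {
  idle:       { holdToTalk: "listening" },
  listening:  { release: "processing" },
  processing: { understood: "options", recognitionFailed: "error" },
  options:    { selected: "confirming" },
  confirming: { confirmed: "inTransit" },
  inTransit:  { rideEnded: "completed" },
  completed:  {},
  error:      { retry: "listening" },  // or the user switches to Chat/UI Flow
};

function next(state: DojoState, event: DojoEvent): DojoState {
  return transitions[state][event] ?? state;  // unknown events leave state unchanged
}
```

Making undefined transitions a no-op keeps stray inputs (a tap during text-to-speech, a repeated confirm) from ever putting Dojo in an undefined state.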
Dojo
One agent, four sides of the platform
Customer
Voice — speak and book
Chat — type with option cards
UI Flow — browse and filter
Driver
Incoming ride request
Pickup navigation
Complete + rate
Vendor
Availability calendar
Accept / decline bookings
Demand heatmap
Fleet manager
Real-time fleet view
Driver assignment
Compliance tracking
Final solution
Dojo Interaction Modes
The core innovation was designing three distinct interaction modes that all connect to the same underlying booking engine. Users choose their mode based on context and preference.
Mode 1: Voice Mode
Design: Full-screen Dojo avatar in center with gradient background (red to blue), prominent "Hold to Talk" button below. When user holds and speaks, Dojo listens, processes, and responds with text-to-speech. Dojo shows visual feedback (pulsing animation during listening, thinking state while processing). When Dojo displays options, the avatar shrinks and moves up, and option cards appear below. User can tap options or continue speaking.
Use case: Premium, hands-free interaction. Users who want concierge experience and are in environments where speaking is acceptable (car, street, quieter venues).
Why it works: Voice-first positioning. Premium feel. Minimal friction. Dojo feels alive and responsive.
Mode 2: Chat Mode
Design: Conversational chat interface at bottom of screen with text input box ("Ask me anything..."). Chat thread above shows Dojo's text responses. Below each message, Dojo shows option cards (venue cards, buttons, date/time selectors). User can type questions or tap options. Full-screen dark background (same as Voice Mode).
Use case: Users in loud environments who can't use voice but want conversational experience. Users who prefer typing. Users with hearing loss.
Why it works: Same conversational flow as voice, but text-based. Option cards reduce ambiguity. Users stay in chat mode throughout the booking; no jumping between UIs.
Mode 3: UI Flow
Design: Structured, stepped booking experience. Home screen shows service categories (Rideshare, Restaurant, Yacht, Beachclub, Nightclub). Tap a category, enter details (party size, date, time). Results page shows all available options with filters (price, distance, rating, cuisine). Tap option for details. Confirm booking. Elevated visual treatment (light gray/white backgrounds, more breathing room) to distinguish from Voice/Chat dark experience. No Dojo avatar; UI is the interface.
Use case: Users who want to browse and compare all options. Users unfamiliar with voice interaction. Users who want structured, predictable flow (like Uber or Resy).
Why it works: Familiar mental model for people who've used Uber or Booking.com. Transparent pricing and options. Filters reduce choice paralysis. Clear step-by-step flow.
How the three modes connect
- Home screen: Dojo avatar, quick-action buttons (Book Nightclub, Book a Yacht, Get a Ride, Reserve Restaurant, Reserve Beach Club), chat input box, and "+" menu showing all options with ability to customize home.
- Mode selection: User taps Dojo (Voice), taps chat input (Chat), or taps "+" menu and "Use UI" (UI Flow).
- Unified payment: All three modes converge on the same payment confirmation. User provides payment info once; system remembers it.
- Booking confirmation: Regardless of mode, user receives same confirmation with venue details, cancellation terms, and next steps.
- User memory: If user books a restaurant in Chat Mode, then opens the app later and wants a ride, the system recognizes them (user sharing, no re-auth needed).
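The user-memory point can be sketched as a tiny recognition layer: a returning device maps back to its existing profile, so saved payment details carry over to the next booking. Everything here (`recognize`, the `Profile` fields, device-keyed lookup) is a hypothetical illustration, not the production auth model.

```typescript
// Hypothetical sketch of "user sharing": a returning user is recognized
// and their saved details pre-fill the next booking, whatever the mode.
interface Profile {
  userId: string;
  paymentOnFile: boolean;
  lastVenue?: string;
}

const profiles = new Map<string, Profile>();

function recognize(deviceId: string): Profile {
  const existing = profiles.get(deviceId);
  if (existing) return existing;  // returning user: no re-auth, no re-entry
  const fresh: Profile = { userId: `u-${profiles.size + 1}`, paymentOnFile: false };
  profiles.set(deviceId, fresh);
  return fresh;
}

// First session: books a restaurant in Chat Mode, stores payment.
const first = recognize("device-A");
first.paymentOnFile = true;
first.lastVenue = "restaurant";

// Later session on the same device: a ride request skips payment entry.
const later = recognize("device-A");
```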
Flow 1: Customer Rideshare Request (Voice Mode + Chat Mode)
Problem: Users in Ibiza need fast, reliable taxi access without opening a separate app or navigating a map. They want to say "taxi to the beach club" and have it happen. But they also need to see the driver, know the ETA, and confirm pickup. Voice-only interaction is not always practical (loud venues, hearing loss).
Solution:
The Customer App opens to a full-screen Dojo avatar in the center, with a "Hold to Talk" button below and a chat input below that. Users can choose:
Via Voice: Tap and hold to speak: "I need a taxi to Pacha." Dojo processes the request via NLP, confirms pickup and destination via text-to-speech, and asks for confirmation ("Does that sound right?"). User says "yes" or taps a confirm button. Dojo transitions to a real-time map showing the assigned driver, vehicle details, and ETA. The user can continue asking questions via voice ("How long?", "Can I cancel?") or wait. During the ride, the map stays visible with driver location and ETA to destination.
Via Chat: Tap the chat input box and type "I need a taxi to Pacha." Dojo responds in text, confirms details, and shows a "Confirm" button. User taps to confirm. Transitions to same real-time map as voice mode. User can type follow-up questions or wait.
Why it works:
- Matches environment: Voice-first in a loud setting. Chat as fallback.
- Clear primary action: Dojo is unmissable; the hold-to-talk button is large and near the thumb.
- Reduces friction: No map selection, no typing a destination unless necessary. Natural language is faster than tapping.
- Real-time transparency: Once a driver is assigned, users see live location and ETA, building confidence.
- Accessibility: Voice + chat handles different user needs (hearing, preference, noise context).
- Fallback UI: If the user needs to see options (multiple drivers, pricing, pickup alternatives), the system surfaces those via cards without breaking the agentic feel.
Flow 2: Customer Experience Booking (Voice Mode + Chat Mode + UI Flow)
Problem: Booking a table, a yacht, or a VIP area requires calling, WhatsApp, or navigating multiple apps. Each venue has different rules: party size, timing, pricing, deposits. The user needs to understand availability and pricing before confirming.
Solution:
All three modes handle this, but with different approaches:
Via Voice or Chat: Same entry point: "I want dinner for 4 tomorrow at 10 PM." Dojo asks clarifying questions if needed ("Any specific cuisine?", "Budget range?"). Dojo returns a curated list of 5–8 available venues, selected by Dojo based on availability and preferences. Each venue appears as a card with name, cuisine, price, and rating. User can tap a card to see full details (photos, reviews, menu, cancellation policy) or say "first option" / type "show me the expensive ones." Once selected, user confirms, provides payment info (if new), and receives confirmation.
Via UI Flow: Select "Reserve Restaurant" from home. Enter party size, date, time. See ALL available restaurants (not curated by Dojo) with advanced filters (cuisine, price range, distance, rating, dietary restrictions). Scroll through results. Tap a restaurant to see full details. Confirm booking.
Why it works:
- Voice/Chat: Conversational discovery, reduced decision fatigue (Dojo curates), premium concierge feel.
- UI Flow: Transparent browsing, full control, familiar mental model for users who like comparing options.
- Consistency: Party-size → date → time → confirm flow repeats across all experience types (restaurants, yachts, beachclubs, nightclubs), so users learn the pattern once.
- Choice: Users choose their mode based on context (quick decision = voice, researching = UI Flow).
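The repeated party-size → date → time → confirm pattern can be modeled once and reused across all experience types. This is an illustrative sketch only; the type and function names (`ExperienceType`, `BookingDraft`, `isReadyToConfirm`) are assumptions for this write-up, not helloDojo's actual code.

```typescript
// Hypothetical model of the shared booking pattern. Because every
// experience type fills the same fields, users learn the flow once.
type ExperienceType = "restaurant" | "yacht" | "beachclub" | "nightclub";

interface BookingDraft {
  experience: ExperienceType;
  partySize?: number;
  date?: string;    // ISO date, e.g. "2026-05-01"
  time?: string;    // e.g. "22:00"
  venueId?: string; // chosen venue/vessel/club
}

// A draft is confirmable only once every shared field is filled,
// regardless of which experience type is being booked.
function isReadyToConfirm(draft: BookingDraft): boolean {
  return (
    draft.partySize !== undefined &&
    draft.partySize > 0 &&
    draft.date !== undefined &&
    draft.time !== undefined &&
    draft.venueId !== undefined
  );
}

const draft: BookingDraft = { experience: "restaurant" };
draft.partySize = 4;
draft.date = "2026-05-01";
draft.time = "22:00";
console.log(isReadyToConfirm(draft)); // missing venueId → false
draft.venueId = "venue-123";
console.log(isReadyToConfirm(draft)); // → true
```

The same draft object can be populated by Voice, Chat, or UI Flow, which is what lets all three modes converge on one confirmation step.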
Flow 3: Driver App (Real-time Request Acceptance and Navigation)
Problem: Drivers in Ibiza need to see incoming ride requests, accept them quickly, and navigate to pickup without fumbling through maps or missing requests. They're on the road; safety and clarity matter.
Solution:
The Driver App is notification-first. New ride requests appear as full-screen cards with pickup location, destination, passenger name, rating, and estimated earnings. The driver taps "Accept" and transitions to navigation mode: large, clear turn-by-turn directions to pickup. Once the driver arrives, they confirm pickup (passenger gets notified), and the app guides them to destination. The interface uses large text, high contrast, and minimal decision points to avoid distraction while driving.
Real-time map shows passenger's exact location on pickup, enabling drivers to confirm "I see you at the corner" without guessing.
Why it works:
- Low cognitive load: Requests are the main thing; everything else is secondary.
- Safety-first: Large buttons (≥48px), minimal text, clear navigation reduce driver distraction.
- Real-time status: Passenger and driver are always aligned (pickup confirmed, in-transit, arriving).
Flow 4: Fleet Manager Portal (Real-time Operations Oversight)
Problem: Fleet managers need to deploy vehicles efficiently, track driver assignments, and monitor compliance. Without real-time visibility, they're managing by intuition or delayed reports.
Solution:
The Fleet Manager Portal opens to a map view showing all fleet vehicles in real time, color-coded by status (available, in-transit, offline). A heatmap overlay shows where ride requests are concentrated across the island, so managers can anticipate demand and reposition vehicles. Below the map, a list view shows vehicle details, assigned drivers, and current ride (if active).
A separate Compliance page shows all drivers and vehicles with document expiration dates: driver licenses, vehicle registration, insurance. The manager gets notifications when documents are expiring (30 days before, 7 days before). This addresses a gap in Uber's model: Uber doesn't manage compliance; helloDojo does because the Ibiza market requires it.
Why it works:
- Operational clarity: One view of the entire fleet state.
- Demand-driven deployment: Heatmap data helps managers place vehicles where users actually need them.
- Compliance automation: Reducing manual document tracking reduces legal and operational risk.
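The 30-day / 7-day expiry warnings described above reduce to a small date comparison. This is a hedged sketch under assumed names (`expiryWarning`, `WarningLevel`); the portal's real implementation is not shown here.

```typescript
// Illustrative compliance-warning logic: classify a document by how
// many whole days remain before its expiration date.
type WarningLevel = "none" | "expiring30" | "expiring7" | "expired";

const DAY_MS = 24 * 60 * 60 * 1000;

function expiryWarning(expiresAt: Date, now: Date): WarningLevel {
  const daysLeft = Math.floor((expiresAt.getTime() - now.getTime()) / DAY_MS);
  if (daysLeft < 0) return "expired";
  if (daysLeft <= 7) return "expiring7";   // urgent: 7-day notification
  if (daysLeft <= 30) return "expiring30"; // early warning: 30-day notification
  return "none";
}

const now = new Date("2026-05-01T00:00:00Z");
console.log(expiryWarning(new Date("2026-05-26T00:00:00Z"), now)); // 25 days → "expiring30"
console.log(expiryWarning(new Date("2026-05-04T00:00:00Z"), now)); // 3 days → "expiring7"
```

Running this check on every driver license, registration, and insurance record on a schedule is what replaces manual document tracking.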
Design system / Design engineering
The design system was built to serve 4 distinct products (Customer App, Driver App, Fleet Manager Portal, Vendor Portal) while maintaining visual and behavioral consistency across three interaction modes.
- Components: Defined 30+ reusable components (buttons, cards, input fields, modals, lists, maps, real-time status indicators, voice UI overlays, option cards for Dojo responses, confirmation dialogs, venue cards, date/time selectors). Components were designed with multiple states: default, hover, active, disabled, loading, error, success.
- Tokens: Color palette (primary Dojo blue, neutrals, semantic colors for status: green for success, red for error, yellow for warning), typography (font scales for headlines, body text, captions, micro-text), spacing (8px base unit system), shadows, border radius, and animation timing. All tokens were exportable to Tailwind config for consistency across React builds.
- Variants: Each component had variants for different contexts. Button variants: primary, secondary, destructive, ghost. Card variants: booking option, driver status, compliance warning, venue option (for Voice/Chat mode). Input variants: text, voice transcript, error state. Modal variants: confirmation, informational, action-required.
- Responsive behavior: Mobile-first design for Customer App and Driver App. Desktop-first for portals. Breakpoints at 375px (mobile), 768px (tablet), 1024px (desktop). Components reflow intelligently: buttons stack on mobile, align inline on desktop; maps expand/contract based on viewport; text scales appropriately.
- Visual distinction for interaction modes:
- Voice Mode / Chat Mode: Full-screen dark background (black to near-black), gradient accents (red-to-blue for Dojo avatar), high contrast. Signals "conversation and focus."
- UI Flow: Elevated background (light gray or white), more breathing room and spacing, structural clarity with steppers or tabs. Signals "browse and explore."
- Both share the same component library, so the shift between modes feels intentional, not fragmented.
- Engineering handoff: Defined component specs in Figma with clear prop documentation, state combinations, and usage rules. Created a Figma MCP integration so AI coding assistants could read component definitions, tokens, and constraints directly. This reduced back-and-forth: the AI model had the design spec embedded in the code generation prompt.
- Production considerations:
- Real-time map rendering: Maps required performance optimization (canvas rendering, clustering, virtual scrolling for large datasets).
- Voice UI states: Special consideration for voice interaction: loading spinner during speech-to-text, transcription display, Dojo "thinking" state (animated avatar or pulsing text), confirmation prompts.
- Voice-to-UI transition: When Dojo displays option cards in Voice Mode, the avatar shrinks and moves up smoothly; options appear below. Reverse animation when user taps an option and conversation continues.
- Fallback accessibility: When voice fails, UI must be clear enough to work as standalone. No relying on voice labels alone.
- Dark mode support: Voice/Chat modes are inherently dark. UI Flow supports both light and dark. Ibiza's nightlife context meant dark mode was essential, not optional.
- Touch targets: Driver App buttons had to be ≥48px on smallest dimension (accessibility + safety while driving). Voice Mode buttons similarly large for thumb-friendly interaction.
- Chat interface: Persistent chat input box visible in Chat Mode home; conversation history scrollable; option cards render below Dojo's text responses.
Frontend / implementation layer
The frontend was built in React + Tailwind using an AI-assisted workflow. This was not a prototyping exercise; it was production code shipped to TestFlight. The architecture had to support three distinct interaction modes (Voice, Chat, UI Flow) while maintaining code reuse.
- Build involvement: I wrote production React code, not prototypes. Used Cursor (AI code editor) with Claude Opus for planning, GPT for code review, and Sonnet/Codex for component implementation. The workflow was: define component spec in natural language → AI generates React component → I review for edge cases, accessibility, performance → iterate. For complex state machines (Voice Mode transcription, Chat conversation history, UI Flow filtering), I worked closely with the AI to clarify product logic before generating code.
- Tools and workflow: React, Tailwind CSS, Lucide React (migrated from Heroicons due to icon library constraints), React Query for server state, Zustand for client state, Mapbox GL for real-time maps, Stripe React components for payments, Web Speech API (browser voice recognition) for Voice Mode. All via Cursor + Claude MCP for Figma (design tokens and component specs read directly into code).
- Product logic clarified:
- Mode selection logic: Home screen detects user's choice (Voice, Chat, or UI); routes to appropriate state machine; maintains session across mode switches (user can start in Voice, switch to Chat, but conversation history and context preserved).
- Voice interaction state machine: Listening → Processing → Dojo responds with audio + text → Show options as cards → User taps or speaks → Confirm → Transition to payment/confirmation.
- Chat state machine: User types → Dojo responds in text → Show options as cards → User taps or types → Confirm → Transition to payment/confirmation.
- UI Flow state machine: Select service → Enter details (party size, date, time) → Filter/browse results → Tap option → Confirm → Payment.
- User sharing logic: the backend detects a returning user, and the app opens their existing account without re-authentication.
- Ride state machine: Dojo/Chat/UI Flow → request pending → driver assigned → in-transit → completed → rating.
- Booking state machine: Dojo/Chat/UI Flow → options display → user selection → confirmation → awaiting venue acceptance → confirmed/rejected.
- Voice interaction fallback: If NLP confidence is low or speech-to-text fails, UI prompts user to select from suggestions, tap options, or switch to Chat/UI Flow.
- Real-time map updates: WebSocket connection from Customer App, Driver App, and Fleet Manager Portal to backend; vehicle positions and ride status update every 5 seconds.
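One of the state machines above — the ride state machine (request pending → driver assigned → in-transit → completed → rating) — can be sketched as an explicit transition table. All names here are illustrative, not the shipped code; the point is that Voice, Chat, and UI Flow all feed the same machine.

```typescript
// Minimal sketch of the ride state machine as a transition table.
// Only listed transitions are legal; anything else is rejected.
type RideState =
  | "requestPending"
  | "driverAssigned"
  | "inTransit"
  | "completed"
  | "rating";

const transitions: Record<RideState, RideState[]> = {
  requestPending: ["driverAssigned"],
  driverAssigned: ["inTransit"],
  inTransit: ["completed"],
  completed: ["rating"],
  rating: [], // terminal
};

function nextRideState(current: RideState, next: RideState): RideState {
  if (!transitions[current].includes(next)) {
    throw new Error(`Illegal transition: ${current} -> ${next}`);
  }
  return next;
}

let state: RideState = "requestPending";
state = nextRideState(state, "driverAssigned");
state = nextRideState(state, "inTransit");
state = nextRideState(state, "completed");
console.log(state); // → "completed"
```

Encoding legal transitions explicitly keeps the booking engine from ever showing a contradictory status (e.g. "in-transit" before a driver is assigned), no matter which interaction mode initiated the ride.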
- Design-to-code bridge:
- Figma MCP integration: AI model reads Figma component definitions, tokens, and constraints and generates code that adheres to design system rules.
- Component specs: Every component had a Figma frame documenting props, states, and usage patterns. For complex components (venue cards, option selectors, voice UI overlays), specs included behavior rules ("avatar animates up when options appear").
- Token consistency: Tailwind config generated from Figma tokens (colors, spacing, typography). No manual color hex codes in component files.
- Responsive rules: Defined viewport behavior in Figma annotations; Tailwind media queries generated from those rules.
- Interaction specs: Dojo voice UI, avatar animations, map animations, real-time status updates all defined in Figma; implemented with matching timing and easing.
- Mode-specific styling: Tailwind classes prefixed by mode context (dark:bg-black for Voice/Chat, bg-gray-50 for UI Flow) to maintain design intent across modes.
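The token-consistency rule above — Tailwind config generated from design tokens, no manual hex codes in components — can be sketched as a small mapping step. The token shape and values here are invented for illustration; the real Figma export format and token names differ.

```typescript
// Hedged sketch: turn exported design tokens into a fragment that
// would be spread into tailwind.config's theme.extend, so components
// only ever reference token names, never raw hex codes.
interface DesignTokens {
  colors: Record<string, string>;
  spacingBase: number;    // 8px base unit system
  spacingSteps: number[]; // multipliers of the base unit
}

const tokens: DesignTokens = {
  colors: { "dojo-blue": "#1d4ed8", success: "#16a34a", error: "#dc2626" },
  spacingBase: 8,
  spacingSteps: [1, 2, 3, 4, 6, 8],
};

function toTailwindTheme(t: DesignTokens) {
  const spacing: Record<string, string> = {};
  for (const step of t.spacingSteps) {
    spacing[String(step)] = `${step * t.spacingBase}px`; // e.g. "4" → "32px"
  }
  return { colors: t.colors, spacing };
}

console.log(toTailwindTheme(tokens).spacing["4"]); // → "32px"
```

Generating the theme rather than hand-writing it is what keeps a token change in Figma propagating to every component in one step.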
- Quality control:
- Staging environment matched production design closely; caught regressions (color shifts, spacing misalignment, missing states).
- QA tested: voice interaction edge cases (loud environments, unclear speech, network delays, speech-to-text accuracy); UI fallbacks (what happens when voice fails); Chat mode performance (lag in typing, image loading); UI Flow filtering and performance on large result sets; real-time map rendering and vehicle update frequency.
- Production UI review: spot-checked real-time map rendering, voice state transitions, error messages, avatar animations, and mode switch smoothness.
- Regression prevention: component library kept each UI consistent; changes to a button, card, or selector propagated across all 4 products and all 3 modes.
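The real-time map behavior QA tested above depends on one piece of merge logic: applying the periodic vehicle-position messages while tolerating out-of-order delivery. This sketch shows only that merge step under an assumed message shape (`VehicleUpdate`); no actual WebSocket plumbing is included.

```typescript
// Illustrative handler for periodic vehicle position updates pushed
// from the backend. Keeps the newest update per vehicle and drops
// stale messages that arrive out of order.
interface VehicleUpdate {
  vehicleId: string;
  lat: number;
  lng: number;
  status: "available" | "inTransit" | "offline";
  ts: number; // server timestamp, ms
}

type FleetState = Map<string, VehicleUpdate>;

function applyUpdate(state: FleetState, update: VehicleUpdate): FleetState {
  const prev = state.get(update.vehicleId);
  if (prev && prev.ts >= update.ts) return state; // stale → drop
  state.set(update.vehicleId, update);
  return state;
}

const fleet: FleetState = new Map();
applyUpdate(fleet, { vehicleId: "v1", lat: 38.90, lng: 1.43, status: "available", ts: 1000 });
applyUpdate(fleet, { vehicleId: "v1", lat: 38.91, lng: 1.44, status: "inTransit", ts: 2000 });
applyUpdate(fleet, { vehicleId: "v1", lat: 38.80, lng: 1.40, status: "offline", ts: 1500 }); // stale
console.log(fleet.get("v1")!.status); // → "inTransit"
```

Dropping stale messages matters with periodic pushes over a flaky mobile network: without the timestamp check, a delayed packet could briefly move a vehicle backwards on the fleet map.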
Design in Figma
Components, tokens, all states
Figma MCP
Spec read directly into prompts
Cursor generates React
Opus plans, Sonnet/Codex builds
Review
Loop until edge cases and a11y are clean
Wire APIs
Mapbox, Stripe, Web Speech, backend
Test across 4 products
Ship to TestFlight
Collaboration
The project involved close coordination between me and the two founders: the CTO and the CEO.
With the CTO: I worked closely with the CTO to translate product logic into APIs and backend endpoints. When I designed the Voice Mode, the CTO had to implement speech-to-text integration, NLP processing, and fallback handling. When I designed the unified booking flow, the CTO had to ensure that restaurants, yachts, nightclubs, and beachclubs all shared the same offer and availability endpoints. When I designed the three interaction modes, the CTO had to ensure that Voice Mode, Chat Mode, and UI Flow all converged on the same payment and confirmation logic. We aligned on data structures early: what information needed to be returned from each API, what states the backend would manage, what validation would happen server-side vs. client-side. This reduced rework.
With the CEO: The CEO had validated rideshare demand and pushed for rapid execution. I advocated for a phased approach (rideshare first, then expansion) but the pressure was to build everything at once. I documented the trade-offs (quality, maintainability, time-to-launch) but ultimately, we shipped all 4 products in parallel. This was a major source of constraint; it worked out, but it was high-risk.
With the designer (subcontracted design systems lead): I provided the component specs, token rules, design principles, and mode-based visual hierarchy. The designer organized them into a scalable system, automated token generation, and created Figma documentation that AI models could read. This division of labor worked: I owned product, UX architecture, and systems thinking; the designer owned system implementation and scalability.
With QA: QA tested voice interaction (speech-to-text accuracy in noisy environments, fallback UX when voice fails), Chat Mode conversation flow and latency, real-time map behavior (performance, update frequency), UI Flow filtering and search, and edge cases (network failures, failed bookings, confirmation timeouts, mode switches mid-booking). I participated in staging reviews to validate that production UI matched design intent and that all three modes felt cohesive.
Product alignment: I created a product spec document that outlined all 4 user types, their journeys, three interaction modes, key flows, states, and edge cases. This became the reference for engineering and the founders. Clear documentation reduced ambiguity and speculation.
CTO
APIs, speech-to-text, unified booking endpoints
CEO
Rideshare validation, scope pressure, business calls
DS lead (subcontract)
Token automation, AI-readable Figma docs
QA
Voice edge cases, mode switches, real-time map
Results and impact
The project is in TestFlight (private beta) with all systems functional and production-ready. However, the product has not launched publicly, and therefore mature usage metrics are not available.
Results:
- Shipping complexity: Delivered 4 interconnected products (Customer App, Driver App, Fleet Manager Portal, Vendor Portal) plus design system, branding, and landing page in 6 months, all at production quality. Most critically, shipped three distinct interaction modes (Voice, Chat, UI Flow) that all connect to the same booking engine, requiring significant UX and architectural sophistication.
- Three-mode interaction design: Validated that voice-first, chat-fallback, and traditional UI could coexist as equal-footing options while maintaining product cohesion. Each mode is visually distinct (dark for Voice/Chat, elevated for UI Flow) yet uses the same components and flows.
- System integration: User sharing, real-time vehicle tracking, offer management, compliance tracking, and multi-mode booking all function as designed.
- Product clarity: Created a unified UX model where rideshare, dining, yachts, and events operate under the same interaction paradigm (Dojo-centric, three modes, UI fallback) while maintaining service-specific nuance.
- Design system adoption: Design system reduced component fragmentation; all 4 products and all 3 interaction modes use the same component library, tokens, and responsive rules.
- Frontend foundation: Built a production React codebase that uses design tokens, reusable components, mode-based state machines, and AI-assisted workflows; this foundation makes future iteration and scaling faster.
- Operational readiness: Fleet management, compliance tracking, and vendor tools are operational and ready to scale; no late-stage surprises in the operational layer.
The strongest evidence of impact is the breadth of the system shipped, the sophistication of the three-mode interaction design, and the reduction of ambiguity from abstract product idea to concrete, functional product ready for real users. The work proved that an agentic, multi-mode concierge model is viable and buildable at scale across multiple user types and service categories.
What I would improve
- Phased launch, not big bang: The CEO's insistence on shipping everything simultaneously introduced risk. In retrospect, launching rideshare first, gathering user feedback on voice interaction fidelity, mode preferences, and demand patterns, and then expanding to dining and experiences would have de-risked the product. The current model ships untested across 4 use cases and 3 interaction modes.
- Voice interaction validation: Without user testing, I don't know whether users prefer voice or chat in real contexts, whether speech-to-text accuracy is sufficient, or whether the Dojo persona resonates. A closed beta with voice-specific instrumentation (mode preference metrics, success rates, dropout points, user feedback) would validate the core bet. Do users actually prefer voice, or do they use it only as a fallback?
- Mode switching behavior: I kept users in Voice/Chat mode to preserve the agentic feel. But some users might want to see all 50 restaurants upfront before deciding. I'd test whether users want easier transitions between Voice/Chat Mode and UI Flow, or whether keeping them separate is actually better for focus.
- Offer system design: The unified offer-creation system was a compromise. Service-specific role types (restaurant vs. yacht vs. nightclub) with tailored booking logic would reduce workarounds. The current system works but requires more configuration by venues than necessary.
- Mobile web parity: The Driver App and Customer App are iOS-only. A mobile web version would extend reach and reduce platform fragility.
- Compliance automation: Document expiration tracking exists, but automated renewal reminders and direct integration with official databases (driver licenses, vehicle registration) would reduce manual overhead for fleet managers.
- Demand prediction: The heatmap shows current demand. Predictive models (ML on historical request patterns) would help fleet managers deploy proactively rather than reactively.
- Dojo personality depth: Dojo's current personality is minimal (helpful, responsive). With more time, I'd develop Dojo's voice more fully: regional Ibiza references, a point of view, humor, personality that makes the interaction feel less generic.
Phased launch
Rideshare first, then expand
Voice validation
Closed beta with mode-pref metrics
Mode switching
Test easier Voice/Chat → UI transitions
Offer system redesign
Service-specific role types
Mobile web parity
Reduce iOS-only fragility
Compliance automation
Renewal reminders, DB integrations
Demand prediction
ML on historical request patterns
Dojo personality depth
Voice, humor, regional references
Final reflection
helloDojo proved that I can take a vague product vision and ship a production-ready, multi-sided system that balances user experience, operational logic, system thinking, and implementation feasibility. The three-mode interaction design (Voice, Chat, UI Flow) demonstrates particularly strong product thinking: rather than forcing users into one mode, I created three paths that serve different contexts and preferences, all unified under the Dojo brand and the same booking logic.
The project demonstrates what "Product Design Engineer" means in practice:
- Product thinking: Understanding Ibiza's ecosystem and transforming "Uber with AI" into a coherent agentic model required product strategy, not UI design. Recognizing that voice-only wasn't practical and designing three distinct modes instead was a product decision, not just a UX feature.
- System thinking: Managing 4 user types, 5+ service categories, real-time state, three interaction modes, and design system consistency required architectural decisions and trade-offs, not just designing screens.
- Implementation awareness: Building production React code using AI-assisted workflows and design-to-code bridges meant thinking about what's buildable, performant, and maintainable, not just what looks good in Figma.
- Design engineering: The design system, token structure, component specs, and mode-based visual hierarchy had to be built so that AI models could read them and generate consistent code. This required thinking about how design intent translates into implementation rules. Mode-specific styling and state management required coordination between design and code.
- Shipping under constraint: 6 months, ambiguous business direction, team changes, pressure to do too much, three interaction modes simultaneously—and still shipping a coherent, production-ready product without cutting corners on quality.
The work proves that I can move fluidly between product discovery, UX architecture, visual design, system thinking, frontend implementation, and team coordination. I don't just execute; I shape product decisions. I don't just design; I build. I don't just build; I think about scale, reusability, and long-term foundations.
helloDojo's ability to unify rideshare, dining, yachts, and events under a single Dojo interface across three interaction modes, and the production readiness of all 4 systems, is the proof. The three-mode design in particular—Voice for premium hands-free booking, Chat for accessibility and preference, UI Flow for browsing and control—shows that I can think beyond "design the UI" into "design the interaction model" and execute both.
Next case study
UMA