The UX research team at eBay New Product Development took a mixed-method approach to help launch a fully localized website in China—eBay Haitao. One of the biggest challenges for the research team of two (including me) was to empower the product team, 80% of whom were non-Chinese speakers, to build a fully localized website for Chinese import shoppers.
However, the enormous differences between Chinese and US shopper behaviors undermined the validity of our original product roadmap and distanced the product team from Chinese consumers.
To bridge this gap, the research team ran a quarterly benchmark program—a systematic approach that 1. identifies user issues across the site, 2. measures predefined UX metrics, and 3. uncovers the why behind the metrics—in order to capture the end-to-end user experience and guide product improvement.
The research team defined the qualitative metrics based on the results of foundational research.
Trustworthiness
- Brand reliability
- Return and money-back guarantee
- Seller and product authenticity

Engagingness
- Breadth of inventory
- Variety of events

Convenience
- Ease of use
- Information clarity
I conducted unmoderated remote studies on both the desktop and mobile websites via usertesting.com with a rolling panel of N=40 each quarter. Based on the pilot studies, we estimated the average problem-occurrence probability at 25%; at that rate, N=40 gave us well over a 95% likelihood of observing each problem at least once.
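The detection likelihood above follows the standard problem-discovery formula, P(seen at least once) = 1 − (1 − p)^n. A minimal sketch of that arithmetic (the function names are my own, not part of the study):

```python
def detection_probability(p: float, n: int) -> float:
    """Chance that a problem occurring with per-user probability p
    is observed at least once among n participants."""
    return 1 - (1 - p) ** n

def users_needed(p: float, target: float) -> int:
    """Smallest panel size whose detection probability reaches the target."""
    n = 1
    while detection_probability(p, n) < target:
        n += 1
    return n

# With the pilot estimate p = 0.25, 11 users already exceed 95% detection,
# so a quarterly panel of N=40 is comfortably above that threshold.
print(users_needed(0.25, 0.95))
print(detection_probability(0.25, 40))
```

This is why a relatively small rolling panel is adequate for *finding* problems, even though (as noted later) it is too small for precise benchmark scores.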
I first logged all of the observed user problems and transcribed participants' responses onto a spreadsheet. I then counted the issues and grouped them by type.
The questionnaire collected continuous scores for the key metrics (e.g., 1=strongly disagree to 5=strongly agree) and prompted participants to explain the scores they chose. The bar charts below visualize the features that worked well or needed improvement from a horizontal UX perspective.
The map broke users' experience on eBay Haitao down into component parts to surface problems and opportunities for improvement. With the user journey map, 1. I could steer the discussion toward the parts that, if fixed, would dramatically improve the user experience, rather than toward incremental changes, and 2. the leadership team could look at user problems from a higher level and decide where to invest strategically.
Horizontal-axis placement of user issues was based on UX data collected from new users over the previous six months; vertical-axis placement was determined in discussion with the product team. This diagram was used to develop concepts, assess feasibility, and set priorities.
I used video clips to communicate critical user problems in internal Slack channels. The clips, as a supplement to the written findings, enriched the research and brought user voices to life. Most importantly, stakeholders tended to take immediate action to address the problems.
1. Incorporated three value propositions after running an online quantitative survey (N=600) with stimuli to test various combinations (language, icons) of the value propositions.
Value propositions
Top: 500 million global top selling goods
Middle: Reliable platform with 22 years of e-commerce experience
Bottom: 30-day money back guarantee
Survey stimuli
2. Placed “Sales amount” and “Direct shipping to China” badges to enhance trust in products.
Chinese consumers read the product sales amount and direct-shipping badges as indicators of a reliable product: the former shows they are not the first to purchase the item, and the latter confirms that the product genuinely ships from overseas.
Badges
Implementation
3. Surfaced negative product reviews and sellers' positive-review rates to increase site transparency.
It may sound counter-intuitive to Western consumers, but Chinese consumers prefer reading negative reviews, since glowing ones are often suspected of being inauthentic. A seller's positive-review rate also helps establish the seller's credibility.
Negative/positive reviews
Implementation
1. Increased the completion rate from 50% to 90%.
Before implementing phone registration, eBay Haitao required an email address to create an account. However, the benchmark studies revealed severe user problems—including misinterpreting the email address field as a home address and failing to separate first and last names—that prevented users from completing the process and led to drop-offs. The product development team took my advice, implemented a more streamlined registration flow using a phone number instead of an email address, and changed the language from traditional to simplified Chinese.
The result was a dramatic lift in the completion rate from 50% to 90% and a statistically significant improvement in the convenience benchmark score from 3.0 to 3.9 (on a 1-5 scale).
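A lift of this size can be sanity-checked with a two-proportion z-test. The sketch below is illustrative only: it assumes 40 participants per quarterly panel (20/40 vs. 36/40 completions), which is my reading of the panel size, not a figure confirmed by the study.

```python
import math

def two_proportion_z(success1, n1, success2, n2):
    """Two-sided two-proportion z-test using the pooled normal approximation."""
    p1, p2 = success1 / n1, success2 / n2
    pooled = (success1 + success2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal CDF (via math.erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical panel sizes: 50% -> 90% completion with 40 users per wave
z, p = two_proportion_z(20, 40, 36, 40)
print(round(z, 2), p < 0.001)
```

Even at these modest sample sizes, a 40-point jump in completion rate is far outside sampling noise.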
Before: email registration After: phone registration
2. Planned a registration QA study with N=70 subjects and managed a local agency to conduct the survey in Shanghai and Guangzhou. The study found a 0% failure rate in registration and 70% of respondents satisfied with the registration process (a score of 8 out of 10), and identified a potential area for improvement: only 51% of respondents felt their personal information was protected.
QA study
1. Added a visual element to celebrate the moment of placing an order successfully.
Before After
2. Used micro-interactions to animate visual elements and attract Chinese consumers’ attention.
Discount animation Loading animation
Our original intent in conducting the benchmark study was to identify all user problems, prioritize them by their severity to the user experience, and recommend solutions to the product development team. However, the validity of the benchmark scores was limited by the small sample size (N=40), and we could not make statistically meaningful judgments about the efficacy of the new implementations. We would benefit from benchmark scores backed by a more rigorous sample-size calculation based on the desired levels of statistical confidence and power.
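To make that sample-size point concrete, here is a standard power calculation for comparing two proportions (the 60%/75% figures are hypothetical targets I chose for illustration, and the z values are hardcoded for the default alpha and power):

```python
import math

def sample_size_two_proportions(p1, p2):
    """Per-group n to detect a difference between two proportions,
    normal approximation, two-sided alpha = 0.05, power = 0.80."""
    z_alpha = 1.96  # two-sided, alpha = 0.05
    z_beta = 0.84   # power = 0.80
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Detecting a 60% -> 75% task-success improvement with 80% power
# requires roughly 150 users per group -- far more than N=40.
print(sample_size_two_proportions(0.60, 0.75))
```

The gap between this number and our quarterly panel is exactly why the benchmark scores could flag problems but not certify improvements.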
The biggest advantage of a within-subjects approach is that it can detect differences between design metrics with a fraction of the users required by a between-subjects approach, saving time and cost in a fast-paced development environment.
In addition, within-subjects comparison controls variability better than between-subjects comparison because the same group of participants takes part in every study. For example, expert participants are likely to provide a more stable comparison because their expertise implies reduced performance variability.