TL;DR: WeShop 1.5 makes meaningful progress in preserving product details and increasing freedom in background transformation, expanding the range of usable scenarios. For core issues such as faces, color, lighting, perspective, hands, and multi-person images, we still need to wait for version 2.0.
As usual, let’s look at some results first:
Turning original backgrounds into studio shots
Switching between outdoor and indoor backgrounds
Mannequins
Magazine covers
Product still life, preserving lighting and edge details
For more examples, please visit: WeShop 1.5 release — WeShop
A necessary disclaimer: all views below are personal views only. They concern AI application startups, not companies competing in foundation-model races. These two kinds of companies follow very different startup models and should not be mixed together.
Two important characteristics of AI-native companies
For me, what counts as an AI-native company is a crucial question. If AI-native companies do not exist, then teams like ours building AI applications will have no long-term commercial value; we would merely be doing trial and error for large companies. With this wave of large-model development, I personally believe AI-native companies will exist, and native commercial giants will grow out of it.
Based on WeShop’s practice, I will irresponsibly summarize two core characteristics of AI-native companies:
1. The R&D process follows the principle of Prompt > LoRA > Finetune
When we discuss AI-native companies, we are really discussing where the dividend of this AI wave lies. Startup teams lack abundant manpower, capital, and resources, so their innovation must be built on relatively low trial-and-error costs, and prompting is the lowest-threshold innovation method that large models provide. In reality, many teams severely underestimate the potential of prompts and have not tested the boundaries of what large models can do with prompting alone.
Next is LoRA. Without LoRA, today’s community would not be as vibrant. In most cases, LoRA costs slightly more than prompting, but inside a team it can form a LoRA factory and enable assembly-line production. By integrating different LoRAs, teams can often create surprisingly strong product results.
Finally there is finetuning. In 2023, many friends came to me to discuss AI startup ideas and usually mentioned finetuning, buying machines, preparing data, and training a vertical model. That made me nervous. The difficulty and cost of finetuning are far beyond many people’s expectations; starting with prompts and LoRA is a more reasonable choice. Of course, after PMF, successfully finetuning a good model may be necessary to form a commercial moat.
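To make the idea of integrating different LoRAs more concrete, here is a minimal sketch of how several adapters are commonly combined: each LoRA contributes a low-rank update B·A, scaled by a weight, on top of the frozen base weights. This is the generic formulation, not necessarily how WeShop combines its LoRAs; the function names, the toy 2x2 layer, and the two adapters are all hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a small illustration."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_loras(base: Matrix, loras: List[Tuple[Matrix, Matrix, float]]) -> Matrix:
    """Return base + sum(scale * B @ A) over all (B, A, scale) adapters."""
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for b, a, scale in loras:
        delta = matmul(b, a)  # low-rank update with the same shape as base
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


# Toy 2x2 "layer" with two hypothetical rank-1 adapters.
base = [[1.0, 0.0], [0.0, 1.0]]
style_lora = ([[1.0], [2.0]], [[3.0, 4.0]], 0.5)   # (B, A, scale)
detail_lora = ([[1.0], [0.0]], [[0.0, 2.0]], 1.0)
merged = merge_loras(base, [style_lora, detail_lora])
```

Because each adapter is only a small additive update, a team can train many of them independently and mix them with different weights per product, which is what makes the assembly-line approach cheap compared with finetuning the whole model.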
2. Strong at single-point capabilities but weak at integration
In other words, current products are friendly to small customers. AI technology is still early, and there will inevitably be many problems, often with things that previous players considered simple. In WeShop, for example, customers may ask: if the background can already be changed realistically, why can’t the shoulder strap on this dress stay thin, or why can’t a red dress become green? Similarly with LLMs: a model can write long essays, so why can’t it handle customer service properly?
For small customers, AI products solve core problems and bring exponential efficiency gains. But for large companies, because of scale and quality requirements, any step that cannot be integrated into the business workflow drags down overall efficiency, so adoption is not worthwhile from a purely commercial perspective. This contradiction will be solved as technology improves and new workflows emerge, but it needs time. That gives many startup teams a window to grow. We must survive and grow before that critical point arrives.
AI applications urgently need hybrid product managers
I personally think this role is extremely scarce right now, and it is one of the key constraints limiting the emergence of excellent AI applications.
Product managers need sharp demand insight. Even in the mobile internet era, product managers with this ability were rare. Imagine a team with such a core person: their challenge is how to turn captured demand into a concrete application. Because of the uncertainty of AI technology, this process becomes unusually complex. In the mobile internet era, after a few rounds of PRD review, teams could determine whether a feature was feasible; there was little ambiguity. Whether the business could succeed on schedule was another matter. But AI development is different: you may produce a demo in half a day, yet spend half a year struggling to launch a product, not to mention growth, operations, and business-model building after PMF. If product and engineering teams cannot communicate effectively and align on standards, iteration efficiency will drop sharply.
Ideally, product people should also understand the characteristics of AI technology. Although today’s AI technology is in some ways easier to get started with than earlier academic theory, building a coherent mental model of it still requires significant training and accumulation. This AI wave has arrived fast, and talent reserves are seriously insufficient. But after a year, I believe many people have gradually adapted and are quietly practicing in different ways. These efforts have not yet fully erupted, but we have reason to look forward to 2024.
Therefore, I hope people interested in product work actively embrace AI technology, dare to try, and dare to innovate. Especially at this stage, they should not blindly imitate. Innovation has the highest return on investment. Even if it does not succeed in the short term, the firsthand understanding gained through practice will significantly improve an individual’s and a team’s grasp of the AI industry.
We want to open source
The construction of WeShop has benefited from many open-source projects and friends in the Stable Diffusion community. After careful consideration, we decided that we should also contribute to the open-source community. We plan to gradually open source WeShop’s frontend, backend, and some model-training tools.
We have released the WeShop personal edition, which can be seen as a variant of the SD web UI. At this stage, we have only released part of the frontend code; it is not yet complete, and please forgive the code quality. Compared with existing WebUIs, we added task management, asynchronous execution, and remote multi-user access, making it more suitable for using SD in real commercial environments.
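As a rough illustration of what task management and asynchronous execution can look like around an image-generation backend, here is a minimal sketch using Python’s `asyncio`. It is not the WeShop code: `TaskManager`, `Task`, and `fake_generate` are hypothetical names, and the generation call is a stand-in for a real SD inference request.

```python
import asyncio
import itertools
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict


@dataclass
class Task:
    task_id: int
    user: str
    prompt: str
    status: str = "queued"  # queued -> running -> done / failed
    result: Any = None


class TaskManager:
    """Hypothetical in-memory task manager shared by several users."""

    def __init__(self, generate: Callable[[str], Awaitable[Any]]) -> None:
        self._generate = generate
        self._ids = itertools.count(1)
        self._queue: asyncio.Queue = asyncio.Queue()
        self.tasks: Dict[int, Task] = {}

    def submit(self, user: str, prompt: str) -> int:
        """Register a task and return its id immediately, without blocking."""
        task = Task(next(self._ids), user, prompt)
        self.tasks[task.task_id] = task
        self._queue.put_nowait(task)
        return task.task_id

    async def worker(self) -> None:
        """Process queued tasks one at a time, recording their status."""
        while True:
            task = await self._queue.get()
            task.status = "running"
            try:
                task.result = await self._generate(task.prompt)
                task.status = "done"
            except Exception as exc:
                task.result = str(exc)
                task.status = "failed"
            finally:
                self._queue.task_done()

    async def drain(self) -> None:
        """Wait until every submitted task has been processed."""
        await self._queue.join()


async def fake_generate(prompt: str) -> str:
    # Stand-in for a real (slow) image-generation call.
    await asyncio.sleep(0.01)
    return f"image for: {prompt}"


async def demo() -> Dict[int, Task]:
    manager = TaskManager(fake_generate)  # created inside the running loop
    worker = asyncio.create_task(manager.worker())
    manager.submit("alice", "studio background")
    manager.submit("bob", "outdoor street")
    await manager.drain()
    worker.cancel()
    return manager.tasks
```

Running `asyncio.run(demo())` returns the task table with both tasks marked `done`. The key design point is that `submit` returns an id right away, so a remote user can poll the task’s status instead of holding a connection open while the GPU works.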
Because the team is small, full open source is still on the way. We need time to complete engineering work on the codebase, so it has not yet been released on GitHub. We have formed a dedicated group and will disclose the source code step by step. People interested in this are welcome to learn more through the Feishu document.
Feishu document: WeShop open-source notes
Some useful resources
Here are some SD- or LLM-related resources that I think are good.
Professor Li Jian from Peking University has a talk on LCM; the whole series is also very good.
Bilibili: Professor Li Jian on LCM
The mathematics of deep learning series from IDEA, founded by Harry Shum:
Bilibili: Mathematics in Deep Learning
Mu Li’s course. Unfortunately, he stopped updating after starting his company.
Bilibili: DALL·E 2 paper reading
Andrew Ng’s courses:
YouTube: https://www.youtube.com/watch?v=T0Qxzf0eaio
Song Yang:
YouTube: https://www.youtube.com/watch?v=y8q3gh61OY0
A broad overview:
YouTube: https://www.youtube.com/watch?v=cS6JQpEY9cs
The Lex Fridman interview with Ilya:
YouTube: Lex Fridman interview with Ilya
LLM product recommendations
There are many resources I did not include. Since I now use LLMs to help me understand text-heavy information more deeply, here are a few tools I often use:
For questions about academic papers, ChatGPT tends to be too verbose, though this may also be because I have not yet pushed my prompting far enough.
For overseas use, I recommend Claude 2:
For domestic use, I recommend:
For everyday questions, I recommend ChatGPT overseas; domestically I recommend Doubao and Tongyi. Doubao’s voice interaction is quite good.
Other posts about WeShop
Wu Haibo: Reporting to everyone that our e-commerce AI model product WeShop beta is open for testing
Wu Haibo: Thoughts from building WeShop — written after the official WeShop launch
PS: If any of the product images above infringe rights, please contact me and I will delete them.