Today I am taking you on an in-depth tour of Youdao Document FM, a seriously underrated AI podcast tool.
As you may know, I was very productive over the May Day holiday and published two articles introducing the new Chinese podcast function in Google’s NotebookLM. At the time, I thought NotebookLM was unbeatable when it came to Chinese podcasts.
However, before the warmth of those articles had even faded, another product came into view.
It all started when I attended the “Let’s Make a Friend with AI” event held by a friend in Hangzhou.
I arrived early, signed in, and entered the prize draw.
When I saw the prize I had drawn, I was stunned.
Youdao Document FM 30-day membership card.
I know Youdao, I know documents, and I know FM, but with the three words put together, I was a little lost.
What is this, why have I never heard of this product?
But maybe fate was quietly at work: after being stunned for half a second, I sensed something was off.
Taken literally, this seems to be some kind of FM app?
After thanking the friends handing out the gifts, I slipped the gift card with the redemption code into my pocket and thought no more about it.
The atmosphere at the event was very warm. I learned a lot from the speakers’ talks, and I also won a folding wireless charger from Zhipu.
After that, I completely forgot about the gift card in my pocket.
After the event, Yun Shu and I had a meal and then relaxed and chatted about life and ideals.
When I returned to the hotel, the gift card felt so similar to my room card that I nearly mixed them up and held the gift card against the door sensor.
Only then did I remember that I still had a Youdao Document FM gift card in my pocket.

I opened my phone’s app store, found the Youdao Document FM app, and downloaded it.
The moment I opened the app, I understood why it is called “Youdao Document FM”.

The interface is very simple, so simple that all the functions sit on a single page. As the name suggests, you can see what it does at a glance.
The whole page can be roughly divided into two parts.
1. Functional area – what it can do
This podcast software has five functions, namely: text to podcast, link to podcast, photo to podcast, file to podcast, and recording transcription.
1. Text to podcast
This function converts the text you enter into podcast content: an idea you jotted down in a note-taking app, an article you wrote, or even an inspirational sentence you came across online.
In a teasing mood, I typed nothing but “Hello” into the text box. I expected it to refuse, or not respond at all, but after I tapped Next and selected the voice and the type of podcast I wanted, it actually started running.

What’s more, it generated a full 4 minutes of audio, with surprisingly rich content.



You can see that the generated page, in addition to the original content you entered, offers three more things: podcast subtitles, a summary, and a mind map.
It is worth mentioning that the structure of the mind map is very clear. Even if you know nothing about the context, the mind map alone gives you a pretty accurate idea of the content.
2. Link to podcast
Enter or paste a web link, and the AI will read the content and convert it into a podcast. The supported links are:
Public links (ones that need no access permission), WeChat official accounts, Zhihu, Tencent Docs, and Feishu Docs.
I pasted the link to the official account article I published yesterday, and after a few minutes it generated an audio file for me. (It’s the one I played for you at the beginning.)


It is worth mentioning that after generation there is an export button in the lower right corner. You can export both the podcast script and the audio to your phone, or just one of them, which is a thoughtful touch.
And these two export options are not limited to this feature: the other features have them too.
3. Photo to podcast
Students will know this pain point: in class, the teacher does not share the slides and lectures so fast that there is no time to take notes. What do you do?
Then this feature is made for you!
Backed by Youdao’s strong technology, you can photograph the slides and save the photos to your phone. After class, open Youdao Document FM, use this function, select the slide photos you took, extract the text, and generate a podcast. You can select up to 9 pictures at a time.
If your phone has a good camera, I suggest taking the photos with the system camera and importing them from your album. If you prefer convenience, you can also shoot with the app’s built-in camera. Its OCR is quite mature: even if a photo is a little blurry, it can still accurately recognize the text in it.



As you can see, once you have taken a photo, tap “Select All” (I don’t recommend “Auto Box”, because that box cannot be dragged, so it easily picks up parts you don’t need). The text in the photo is extracted cleanly, and then you tap “Next” to generate a podcast.
This is a godsend for students! From now on, Mom no longer has to worry about your hands getting sore from taking notes!
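If a lecture leaves you with more photos than fit into one import, you have to split them into groups yourself. Here is a minimal Python sketch of that bookkeeping. The limit of 9 comes from the app as described above; the function name, the file names, and the idea of doing this outside the app are my own illustrative assumptions, not anything the app provides.

```python
from typing import List

# Limit described in this article; the app may change it at any time.
MAX_PHOTOS_PER_IMPORT = 9


def split_into_batches(photos: List[str],
                       batch_size: int = MAX_PHOTOS_PER_IMPORT) -> List[List[str]]:
    """Split photo file names into consecutive groups of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [photos[i:i + batch_size] for i in range(0, len(photos), batch_size)]


if __name__ == "__main__":
    # 20 hypothetical slide photos from one lecture
    slides = [f"slide_{n:02d}.jpg" for n in range(1, 21)]
    for number, batch in enumerate(split_into_batches(slides), start=1):
        print(f"Import {number}: {len(batch)} photos")
```

With 20 photos this gives three imports (9, 9, and 2 photos), so you know in advance how many podcasts one lecture will turn into.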
4. File to podcast
If you’ve used this feature, you’ll know why I say I was “proven wrong again”, because it’s really, really cool!
It supports the common document types: pdf, doc, and txt.
In this regard, it is basically on par with NotebookLM’s Chinese podcasts. Where it is slightly better is that the length of the podcasts it generates varies with the usage scenario, which I will explain in detail later.
After selecting the document, you can generate podcasts with different explanation depths according to your needs and choose the anchor voices you want: one man and one woman, two women, or two men. This is where a Chinese-made product shows its advantage: a high degree of customization.

I chose the “in-depth lecture” mode. It generated a half-hour audio for me, all substance and no filler, long enough to cover your whole commute, from the moment you walk out of the office until you get home. To be honest, when I first saw the duration, I was shocked.
5. Recording transcription
Dragged into a meeting by your boss at the last minute, with nothing prepared?
Don’t panic! You only need two steps to produce the meeting minutes and easily earn your boss’s appreciation!
Then you can put down your phone, look at your boss confidently, and give them your full attention. After the meeting, tidy up the transcribed notes and send them over, and your boss will praise you on the spot: “You are a talent!”
Although the app does not say it supports recording the phone’s internal audio, how could that stump a little genius like me?
If you try to start recording first and then switch to another app, you’ll be disappointed when you return to FM full of anticipation, because it automatically stops recording when you switch apps.
Here’s how I did it:
1. Shrink the app to a small floating window
2. Open the video or online meeting app
3. Switch the audio to the loudspeaker
At this point, you will find that the sound from the video really can be transcribed into text!
But there is one condition: you need to do this in a relatively quiet environment free of interference. Otherwise the ambient sounds around you will be recorded too, and the transcription will fail.

How about that, pretty awesome, right? Praise me (proud face)!
There is one more feature that must be mentioned.
How could a document FM app do without the backing of a large model?
See the box at the top of the functional area? Tap it and type your question, and you can skip everything else and send your request directly to Teacher D. It not only answers your question; at the end of its reply you can also generate a podcast from the reply, so you don’t have to read the text at all. Just listen, and the knowledge goes into your head.
For example, I asked Teacher D to tell me about the history of AI:


The podcast it generates not only follows Teacher D’s reply faithfully, but also explains it thoroughly and really gets you thinking.
Now that we have covered the functional area, let’s talk about the audio area, which is where the generated audio lives.
2. Audio area – what is it?
The audio area has the following sections:
Mine (audio you generated in the functional area), subscriptions, English follow-up, exam memorization, and listen to good books.
Let me explain, one by one, what these sections are for.
1. Subscription
- If you follow domestic and international news, you can subscribe to the “Global News” and “Domestic News” sections, which summarize the previous day’s news for you the next day. You can choose a voice you like to present the news, just like carrying two professional news anchors with you.
- If you are interested in the economy, you can choose the “New Consumption” and “Business” sections, and the app will push hot financial news to you so you can easily keep your finger on the pulse of the market.
- If you care about health, you can subscribe to the “Health” and “Life Sciences” sections, and from then on you can hold your own in front of the self-appointed “health experts” among your aunts and relatives.
- If you are a parent or work in education, the “Education” section is definitely an essential source of information for you.
- If you are also learning AI and want the latest AI news, subscribing to the “AI Trends” section can help you find your footing in the ever-changing AI field.
- If you are a sports fan, you can also subscribe to the “Sports” section to find out what is happening in the sports world.
Or, if you would rather not choose at all, subscribe to them all.


2. English follow-up
If you are learning English, you are in luck! We all know that Youdao is strong in English learning, and Youdao Dictionary has helped many people along the way.
Here, Youdao has prepared a limited but rich selection of English audio. After a conversation plays, you can pause and read along, and for words you didn’t catch clearly, you can drag the progress bar and replay them until you have them down.

3. Exam memorization
Are you taking the UGC exam and often at a loss when faced with a thick stack of materials?
Exam memorization is an essential tool for anyone preparing for the primary school teacher qualification certificate.
Like “English follow-up” above, it works through manual pausing: after a piece of content plays, you pause, recite it yourself, and see whether you have mastered it all and which knowledge points you missed, which helps you review.
In this section, Youdao has prepared general primary school teaching materials, covering basic knowledge, teaching ability, cultural literacy, and more, which is very considerate.

4. Listen to good books
Want to grow personally but don’t know which books to read?
Youdao has already sorted that out for you!
“Rich Dad Poor Dad”, “The 7 Habits of Highly Effective People”, “Sleep Revolution”… There are 8 books in this section, all converted into audio. You can tap to listen right away, or read the summary or mind map first and follow along while listening to deepen your understanding.
Once you have a grasp of what these books are about, reading the whole book closely becomes much more efficient.

While using Youdao Document FM, I found some features that genuinely amazed me. Let me walk you through them.
3. Highlight analysis
1. Support multiple languages


The output speech supports 10 languages from different countries/regions, not just Chinese, which covers most usage scenarios.
For Chinese voices, it supports three accents: Mandarin, Hong Kong-accented Mandarin, and a Taiwanese accent.
For English voices, it supports four: American, British, Australian, and Indian accents, which is quite a range.
All other languages use their default accent.
Don’t underestimate this feature. The reason I count it as a highlight is its wide range of usage scenarios.
- For example, if you are learning English, you can feed it a news article and have it explain the article in English to train your listening.
- Alternatively, you can read along with it to strengthen your speaking skills.
- For example, if you are a podcast creator, you can publish the content you produce on foreign podcast platforms and attract foreign listeners without knowing the language yourself.
- Another example…
You, dear reader, are surely smarter than I am and can come up with more uses.
Feel free to post your ideas in the comments, and let’s discuss them together.
2. More than 100 anchor tones


Tired of hearing the same handful of voices over and over?
Youdao Document FM has prepared more than 100 voices: mature men, intellectual mature women, youthful young men, young schoolgirls… a whole variety for you to choose from.
Struggling to find one you like? Not a chance!
It is worth mentioning that, as a product of a Chinese company, it supports Chinese voices best, with 43 of them, almost half of the total. English comes next with 25 voices. Together they meet the listening needs of different groups of people quite well.
3. 20 explanation modes
Different people need different things from different kinds of information.
For example, faced with a long document, some people want to skim it quickly and absorb the essentials;
Some people like to read closely, the more detail the better, so that everything is easy to understand;
Some people want to listen to it like a novel, letting knowledge flow gently into the brain like water;
Some people think critically and want to take information apart and rebuild it through debate.
All of these needs can be met. Youdao Document FM offers 20 explanation modes: in-depth lectures, document speed reading, bedtime stories, debates, and more. With so many styles, there is bound to be one you like.
Note that the generated audio duration depends on the explanation mode you choose, and the difference can be large. Speed reading and in-depth lectures, for example, differ in content and differ greatly in duration.

4. Super anthropomorphic anchor dialogue
In the audio at the beginning, you can hear that this podcast sounds remarkably human, and that it adapts to different scenarios.
For example, when the podcast type is an interview, the anchor pauses as if thinking, as though he really needs to check his notes or consider his answer.
When the podcast type is a talk show, there is a bit more “improvisation”: one anchor is talking and is suddenly interrupted by an interjection from his partner.
When the podcast type is in-depth reporting, a more formal tone is clearly called for, and this is where the details show: the AI anchors even imitate the heavy breathing of human anchors reading the news.
What’s even more amazing is that when you listen to the virtual anchors’ conversation on your phone, nothing feels out of place. In casual chat, their voices are relaxed, and they even “add drama” on their own. So that nothing feels strange to us, the audio imitates the structure of a real podcast show to make the experience more convincing: it mentions at the beginning that this is their xth episode, wraps up naturally at the end, and says see you next time.
Really, when I listened to the first audio I generated in this app, I was blown away:
The boundary between virtual and real seems to be starting to blur.
5. Open source in the works
During my trial I also learned one more thing: there are plans to open source this project, and the team’s engineers are already working on it, so that is something to look forward to.
4. A few additional notes
Some of you may ask: with all this praise, were you paid a lot of money?
To prove that this is not a paid post, I will also share some areas where, in my experience, this product needs to improve.
First of all, you have seen that my experience was pretty good: more than 100 voices to choose from, and audio of more than 30 minutes…
But wait, let me pour some cold water on that.
All of this is because I redeemed their membership.
If you pay for a membership, your experience will be very good and you will have no problems. But if you’re not a member, you might feel blocked at every turn. Just look at the table below.
Here’s a screenshot from their member center. You can see that the gap between ordinary users and members is quite large.

Secondly, although it supports the common document formats, there are still areas that need work compared with NotebookLM.
Finally, there is also a limit on the size of uploaded documents: you can only upload files of up to about 20MB.
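If you want to know before you even open the app whether a file is likely to go through, the two limits mentioned in this article are easy to check yourself. Below is a rough Python sketch. It only encodes what I observed (pdf, doc, and txt, and a ceiling of about 20MB); the exact cutoff, the function name, and the sample files are my own assumptions, and the app’s real rules may differ.

```python
from pathlib import Path

# Formats listed in this article; the app may accept others.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".txt"}
# "About 20MB": the exact cutoff is an assumption.
MAX_SIZE_BYTES = 20 * 1024 * 1024


def check_upload(name: str, size_bytes: int) -> list:
    """Return the reasons a file may be rejected (an empty list means it looks fine)."""
    problems = []
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"too large: {size_bytes / 1024 / 1024:.1f} MB")
    return problems


if __name__ == "__main__":
    # Hypothetical files: a small PDF and an oversized slide deck
    print(check_upload("lecture_notes.pdf", 5 * 1024 * 1024))
    print(check_upload("slides.pptx", 30 * 1024 * 1024))
```

The small PDF passes, while the slide deck fails on both counts, so you would need to convert it to pdf and shrink it first.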
The membership is not too expensive, which is nice: even the annual subscription costs only about as much as a power bank.
You may say that everything this app does could be implemented with an MCP, and perhaps done better.
However, if you are a beginner without strong hands-on skills, just watching all those tutorials already takes long enough, never mind downloading software and practicing. For beginners, the learning cost of MCP is still very high, and that is a natural barrier.
This app, by contrast, has no learning cost at all. All its functions are on one page, and you can tell at a glance roughly what each one is for, so you can use it the moment you get it. That is its highlight.
Epilogue
NotebookLM and Youdao Document FM are not yet in the same league, but take note: NotebookLM launched at the end of 2023, while Youdao Document FM only opened its public beta on March 25, 2025, just over a month ago.
To be honest, getting this far in a little over a month is already very impressive.
What we should pay attention to may not be whether Youdao Document FM can compete with NotebookLM, or which of the two is better, but this:
Major Chinese tech companies have started working on AI-native podcast software that is better suited to Chinese users.
Even if Youdao Document FM is not yet as good as one might imagine, I wish it well and hope it can catch up and soon become what it claims to be:
“NotebookLM for China”.
I believe that day will not be too far away.
Writing this, I suddenly feel a little dazed. What exactly does AI podcasting mean to us? It means not only podcast freedom, but also a rebuilding of the knowledge system from the ground up. Even if, because of technical limitations, it still counts as “information fast food”, once the major companies start pushing, how long will it take for the barriers of traditional knowledge to be broken?
I’m curious.
Thanks for reading this far. If this article was helpful to you, please like it and recommend it to your friends.
What are your thoughts on AI podcasts and Youdao Document FM? Feel free to share them in the comments.