Love On
A story about My Internship @Lenovo
Prequel: College
My first major in college was linguistics. I chose it because of my passion for languages and etymology. But the more I studied linguistics, the more I found that it is hard to land a job with this major unless you also know natural language processing. In this age of big data, people tackle language problems by using machine learning to analyze and process human language. Therefore, I decided to change my major but keep linguistics as a minor. Since the computer science major is capped at my university, I transferred to applied mathematics and planned to continue studying data analytics in grad school. Luckily, I completed all my coursework in three and a half years. I could have used my last half year to complete linguistics as a double major, but I decided to graduate early, because I thought that learning on the job as an intern would be far more fruitful than learning at this overcrowded university. At the end of 2018, I graduated and came back to China to look for an internship before going to grad school.
Part I: A Fresh Start
In February 2019, I received two internship offers. One was from a small data engineering company based in my hometown, Hangzhou, which builds databases for the government. The other was from the Natural Language Processing team of Lenovo Research's Artificial Intelligence Lab, based at its Beijing headquarters. Although the former offered a better salary and was more convenient, since I would not need to move to a new city or rent an apartment, I chose the latter for my original passion for natural language processing, and for a new adventure and life experience in Beijing. Even though I knew that this path would not be easy, I set out on this journey with no regrets.
However, getting ready for this internship was far more complicated than I expected. The chilly winter and the haze in Beijing did not look so welcoming when I landed at its massive airport. When I went looking for an apartment to rent the next day, I was overwhelmed by Beijing’s terrible traffic during the morning peak. Even though the subway in Beijing is well developed (a train every 3 minutes), it is still packed with crowds of people. There were even lines to get into the station, not to mention the crowded lines to get onto the train. Luckily I only needed a 30-minute ride, but an hour-long ride in such an overcrowded train is pretty normal for commuters in Beijing. After getting off the train, I still needed to take a shuttle for half an hour through a total traffic jam to get to the office.
Although life in Beijing was tough, I felt refreshed as soon as I walked onto the campus of Lenovo’s brand-new headquarters. The complex looks just like the campuses of the tech giants in Silicon Valley. It is neat, massive, modern and comfortable. There is everything we need for both work and life on campus, including a library, gyms, sports facilities like basketball and badminton courts, restaurants, cafes, convenience stores, clinics, and even laundry rooms. I felt that people could literally live there (or are those facilities so well prepared for working overtime?).
Part II: Linguistics Team
I started my job as an intern on the Linguistics team, a sub-unit of the NLP team of the AI Lab at Lenovo Research. The NLP team had developed English chatbots (robots that can chat with customers, like Siri) for Motorola's and Lenovo’s customer services. When customers have a question about Lenovo or Motorola products, they can just talk to the chatbots instead of human representatives. Motorola’s chatbot is called Moli. Lenovo’s chatbot is called Lena.
My job was to analyze semantics according to Rhetorical Structure Theory, and to manually classify the speech act and intent of each utterance in 200 chat logs between customers and Moli every day. When I started, I made what was probably the biggest mistake of my whole internship. Usually, when interns arrive, they spend the morning reading the guidelines to get familiar with every category, and spend the afternoon doing their classification work. However, when I arrived on the first day, the office was short of desks and my supervisors were busy finding me a place to sit and work. In the afternoon, I was busy helping my supervisors with some Python tasks. Those unexpected challenges left me no time to read the guidelines. At the end of the day, I had to do the classification work while learning the guidelines at the same time. The next day, my supervisors told me that the accuracy of my work was too low, returned it to me and asked me to redo it. I understood that the problem was that I was not familiar with the guidelines, so I used my spare time to memorize them and then went back to the classification work.
For this work, I used a lot of my knowledge of syntax, and learned a great deal of domain knowledge about electronic devices. However, I knew that I had only had a small taste of NLP, and I wanted to know more about it. A month later, while I was having lunch with my supervisors, they mentioned that they had interviewed two applicants for an intern position. They really wanted to take both of them, but they only had one headcount left. I politely proposed that if they agreed to transfer me to another team, they would have one more headcount. To my surprise, they approved and let the Natural Language Understanding team borrow me. From then on, I belonged to both the NLU team and the Linguistics team, and found that the work in the NLU team was far more challenging and fun.
Part III: NLU team
Unlike the Linguistics team, which supports other teams by doing the manual classification work, the Natural Language Understanding team develops algorithms that let chatbots understand customers’ language. The interns in the Linguistics team are language geniuses, undergraduates studying foreign languages, but the interns in the NLU team are programming experts, graduate students studying machine learning. I was astonished to find that I was the only undergrad there. My supervisors agreed to take me in because I could speak Japanese, could code in Python and was already familiar with the classification rules thanks to my previous work in the Linguistics team. Back then, Lenovo had already developed the English Lena, and wanted to develop a Japanese Lena by translating the English content into Japanese. Although the business-related content is translatable, the social talk in each language is largely untranslatable, because different languages have different cultures behind them. Another problem was that they did not have any data to train the chatbot, because the Japanese chatbot had not been released yet. There were no chat logs between the chatbot and customers. Therefore, they needed somebody who knew Japanese and coding, and was familiar with the classification rules, to build some training data for the chatbot. And I was the only one who could do that.
When I started, the only data I had were some chat logs translated from English into Japanese, and some chat logs between human representatives and Japanese customers. The first problem I had was how to parse the Japanese sentences to catch the intents behind them. The Japanese writing system is among the most complicated in the world. It mixes characters from three different scripts (kanji, hiragana and katakana) without any clear sign of where a word starts and ends (English uses blank spaces for that). Thus, I needed to find the right way to tokenize Japanese words in order to analyze the semantics. I used the Python package Janome to parse the sentences into morphemes, and then extracted the important morphemes to build a keyword-rule look-up table. In short, I set the rules for Lena and made the training data for it, so that in the future Lena can learn Japanese social talk by itself. After several rounds of tuning and testing, I raised the chatbot’s understanding rate of Japanese social talk from 0 to 70%. It was my greatest achievement of the whole internship, and also the first time I worked without clear instructions, because I was the one who started the work, and the only one who could do it.
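The tokenize-then-look-up idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used at Lenovo: the rule table `SOCIAL_TALK_RULES`, the intent labels, and the function names `tokenize_with_janome` and `classify_intent` are all hypothetical. The only real library calls are Janome's `Tokenizer().tokenize()` and the `surface` and `base_form` attributes of its tokens.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical keyword -> intent rules, for illustration only.
# The actual rules and intent categories are not given in this post.
SOCIAL_TALK_RULES: Dict[str, str] = {
    "こんにちは": "greeting",
    "ありがとう": "thanks",
    "さようなら": "farewell",
}


def tokenize_with_janome(sentence: str) -> List[str]:
    """Split a Japanese sentence into morphemes with Janome.

    Janome is a third-party package (pip install janome), so it is
    imported here rather than at module level.
    """
    from janome.tokenizer import Tokenizer

    tokenizer = Tokenizer()  # building the tokenizer is slow; reuse it in real code
    morphemes: List[str] = []
    for token in tokenizer.tokenize(sentence):
        # Keep the surface form, plus the dictionary form when it differs,
        # so that inflected words can still match a rule.
        morphemes.append(token.surface)
        if token.base_form != token.surface:
            morphemes.append(token.base_form)
    return morphemes


def classify_intent(
    morphemes: Iterable[str],
    rules: Dict[str, str] = SOCIAL_TALK_RULES,
) -> Optional[str]:
    """Return the intent of the first morpheme found in the rule table."""
    for morpheme in morphemes:
        intent = rules.get(morpheme)
        if intent is not None:
            return intent
    return None  # no rule matched
```

With Janome installed, the two steps chain as `classify_intent(tokenize_with_janome(sentence))`. Sentences labeled this way could then serve as seed training data, which is the role the look-up table played in the story above.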
Besides the work in the NLU team, I was also helping the Linguistics team. My supervisor was training colleagues in Japan to do the manual classification. I advised the Japanese colleagues by correcting their answers, and used slides to give them feedback.
Spin-off: Besides Work
Now let me talk about the company. Although Lenovo originated in China, its culture is actually really international. The employees are culturally very diverse. Everyone wears casual outfits to work. As an intern there, I could even participate in a team-building event at the Great Wall.
When I arrived, I happened to participate in the AI Lab’s celebration of its 2-year anniversary. I was totally awed by its achievements. By number of patents, Lenovo is ranked №3 in China and №19 in the world in the field of NLP. Everyone in the NLP team graduated with a Master’s or PhD degree from a renowned university. The team leaders and the managers are top scientists in this field with doctorates. Even though Lenovo's major business is hardware, Lenovo is spending a lot of money and bringing in a lot of talent for the development of artificial intelligence, hoping to lead in this technological revolution.
I felt so lucky to meet so many great people there, who are the most intelligent and the best at their work. These people include my managers and supervisors, who patiently taught me to do my work and pardoned my mistakes with no complaints. One interesting thing I remember is that my supervisor in the NLU team was the VP of the Toastmasters club, a club where members can practice and compete on their public speaking skills. One day she took me to her club, and to my surprise I won first place even though it was my first time there. But most importantly, I befriended other interns and kept up these friendships afterwards. Having studied in the US for 3.5 years, this was my first chance to get to know the lives of Chinese undergrad students. Surrounded by these people, I felt that my days were filled with joy even when the work was sometimes stressful.
In July 2019, because I had to prepare for grad school, I ended my 5-month full-time internship at Lenovo with a lot of great memories, not just of the work per se, but of everything. I may not choose to become an NLP developer in the future because I do not have enough machine learning development experience, but I am so glad that in my career, I still did something for my love of linguistics, for my original passion, for my 初心. Although I missed a lot of campus life by graduating earlier than my peers, this internship filled my last year of college with an incomparable experience. Lenovo, Love On.