DeepSync: 10x Audio Content production by voice syncing

It needs intelligence to create intelligence. 

Ishan Sharma and Rishikesh, two tech enthusiasts, built an artificial-intelligence-based technology that clones a voice and speaks like the original speaker. This technology is revolutionizing audio content production and giving a new shape to the future of audio-driven content such as podcasts.

Together they started Deepsync Technologies, a tech company, in December 2018 to offer this voice-syncing application to the world. With it, producing audio content becomes much cheaper and 10x faster, saving 90% of the time and effort.

We interviewed Ishan Sharma, the co-founder of Deepsync Technologies, to discover more about their startup journey and how they turned Deepsync from just an idea into a reality. Their story is a big motivation for budding tech entrepreneurs.

About Ishan Sharma:

Ishan is an engineer from SRM University, Chennai. He has spent most of his time building technology of one sort or another. After working with startups in Bangalore, he spent a great deal of time in Berlin, working on deep-tech-driven applications.

In December 2018, he started Deepsync with his co-founder Rishikesh, who is an AI Engineer and has spent the last 5 years working in Deep Learning.

Interview

Q1. Deepsync is an AI-based technology, so we would like to hear from you: what is artificial intelligence?

Artificial Intelligence is simply the creation and simulation of intelligent behavior in a machine.

What makes AI more accessible today is the abundance of data, computation (in the form of GPUs/TPUs) and the creation of algorithms based on the human brain.

In principle, intelligence is by no means a biological operation and could be created on any substrate, given the right form of engineering, which we’re learning at a rapid pace, thanks in part to neuroscience and engineering.

In the coming decades, we should see AI permeating every facet of human civilization, from robots taking over our household jobs to algorithms shaping economic policies and efficiently designing spaceships for interstellar missions. This is the right time for us to educate ourselves and others to take part in what would eventually be the greatest leap forward in the history of our species (provided we don’t ruin it first).

Q2. I came to know that you met your co-founder on GitHub. I think that’s different and interesting. Please tell us more about it.

We connected on GitHub when we were both working on similar side projects alongside our jobs.

Finding that we were both fascinated by AI-generated audio content and had similar backgrounds in engineering, we connected quickly and decided to build this company together. We soon quit our jobs to start our startup and moved to Bangalore to begin the work.

Q3. How would you describe the role of a co-founder in a startup journey? And how do you build a strong and unbreakable bond among co-founders?

Roles and responsibilities arise naturally from what each person is interested in and good at.

Most startups fail because the founders don’t connect properly, even if they are old friends. It’s just the way this world operates. But we have found that by giving each other space to form opinions and focusing only on what is right and not who is right, you can start to build respect for each other’s thinking.

At the end of the day, you’re fighting for the same cause and are in the same boat. This is why it’s good not to jump to conclusions and to always discuss the areas where you believe others’ views would help.

Q4. Now, let’s get to Deepsync! What is Deepsync? Which area stands to benefit the most from it?

Deepsync is an audio content production technology. We are focused on reducing the cost and time associated with audio production for creators.

The industries we are focusing on produce audio-first content, such as podcasting and audiobooks. We want to empower creators to produce high-quality content in their own voice without the hassle of manual production, reducing the time spent recording, doing retakes, editing and doing post-production.

The creator is still responsible for writing and researching the content and overseeing the output, but the most time-consuming and costly part can now be augmented using our technology.

Q5. Why should an industry use Deepsync over other machine-generated voices (Amazon and Google AI)?

Rather than building a library of ready-made voices, as Amazon or Google do, we let creators and companies sync their own voice. This is because most audio content and podcasting is creator-first (their brand matters), and creators are the ones who face the cost and time barriers of scaling their audio needs.

According to our survey, it can take up to 6.2 hours of work to produce just 1 hour of output, and that is for professional creators.

For non-professionals, this isn’t affordable, so the quality is usually not that great, which limits the potential for content creation. By syncing their own voice, they now have the superpower of producing audio faster and more economically while retaining their own brand with their voice.

Q6. Creating voice content with Deepsync is fast, cost-effective and personalized. Given this, how can an interested industry get started with Deepsync?

To start with Deepsync, a creator or a company can sign up on our platform at Deepsync.co or schedule a demo directly with our team.

We will walk them through our product and begin the process of syncing their voice with us. Once synced, they can produce audio content faster than real time, i.e., 1 hour of audio in studio-like quality takes only 20 minutes to produce, saving enormous amounts of the time and cost associated with production. This eliminates the need for retakes, editing or any form of post-production.

Q7. What message do you have for the budding entrepreneurs?

If you’re trying to create a new venture, almost nobody can predict the outcome.

Whether your company turns out to be a success depends on your effort and your decisions. Although there are extreme external factors involved, most of the time it’s all on you. So invest time in yourself, understand your market, define your product, learn from your customers and, most of all, don’t give up even when you feel like it.

Lastly, pick a business only if you’re passionate about building one as it can get pretty rough pretty quickly.

Q8. Ideas are crucial to every startup. How do you come up with more ideas?

This was only partially true for us, but try to figure out your interests and see whether you face problems that many others also face. Many great companies were formed out of such situations. If that doesn’t work, look at what people are trying to accomplish, what job they are trying to do, and whether your service or product can make a big difference in helping them achieve their goals.

Q9. How do you see Deepsync in the coming years [2020-2023]?

As Deepsync Technologies succeeds in helping creators produce audio content much more quickly and efficiently, it makes sense to double down on that and keep improving the way audio content is created.

We are a research-first company, so we’ll always be trying to innovate and find new ways to bring value to our customers.

Q10. Here is one of your tweets. I think it’s some kind of message! Please tell us what it means.

There is an idea (and now a field in AI research) of trying to build friendly AI. We often hear researchers and engineers (even philosophers) claim that all we need to do to make AI bend to our needs is to transfer our value judgments, such as morality (what is good vs. bad) and ethics (see the trolley problem), and that we would then have created the perfect machine that will lead us into some sort of utopia. It is likely a lost cause for many reasons:

  • First, it’s very hard for us to codify a goal that we want to achieve while taking into account all the factors and permutations that might push it in the wrong direction. For example, let’s say we ask an AI to make us happy. The AI (being enormously intelligent) decides that the best way to make us all happy is to stick electrodes in our brains and run a simulation of eternal pleasure. From the AI’s point of view, that is correct, but it’s not what we want. This idea of what we want is extremely hard to codify because the AI would need to have a human perspective to do so. The AI has to learn what we want.
  • Second, even if we somehow make the AI learn what we truly want, how do you decide what is ‘true’, given that we humans have been debating that since the dawn of time? Do you really want to give this extremely powerful system an ideology that is not universal, when there is hardly such a thing as universal values? Even if we leave the AI to learn its own values, it might become pretty clear to this AI that we humans are not really that morally driven. How would it react?
  • I am not saying that there are no universal values. I believe every religious person, every citizen, even every child believes that we should eradicate unnecessary suffering by ensuring things such as access to food, water, good education, good health, equal opportunities, freedom to express oneself, etc. But to create this utopia with a technology as powerful as AI (whose power many people will underestimate), we need global collaboration, which as of now looks like a lost cause.

This Post Has 3 Comments

  1. Praful

    This can be a handy tool to have but developers need to be cautious as it can be used for vile purposes as well.

  2. Michelle

    This is super cool! The way the voice sounds exactly the same as the original is quite scary. Don’t you think there’s a lot of potential for using this type of technology in a destructive way, i.e. blackmailing someone with something they never actually said? Just a thought.

    All the best, Michelle (michellesclutterbox.com)

  3. SnowBoy

    Cool verte nice
