Tuesday, 25 February 2014

DU Undergrads Develop App For Offbeat Languages (courtesy of Silicon India)

Ever wondered if you could speak Ladakhi or Mao Naga? The Indic Language application being developed by a few enthusiastic students from Delhi University’s Cluster Innovation Centre will allow you to do so. It is being put together by a group of four undergraduate students: Himanshu Patel, Vivek Shekhar and Leelambar Soren, led by Vikalp Kumar.

Vikalp Kumar, 21, developed a liking for languages when he was able to impress his friends by writing their names in different scripts and languages. He is originally from Chennai and knows five languages beyond the usual Tamil, English and Hindi. The other languages he is proficient in are Telugu, Kannada, Urdu, Punjabi and Sarazi, which is spoken in a district of Kashmir. He has an understanding of Sanskrit and Persian as well.

Under the guidance of Sukrita Paul Kumar, Coordinator of BTech in Humanities at DU, the group started with spadework and followed it up with a questionnaire. This preparatory work took them half a year. Sukrita, who was an editor for the People's Linguistic Survey of India, knew just how big an undertaking this project was. Though an exercise of this sort would normally require generous funding and daunting field visits, the team found ways around both.

Vikalp said, "There are speakers of 80 northeastern languages in Delhi." The questionnaire, which contains over 2,600 English words and phrases, is circulated among native speakers of a language to find the closest equivalents in that language.

In September 2013, in what Vikalp calls a "rapid vocabulary collection workshop", 2,500 words in Ladakhi were collected and recorded in about four hours. Speakers of Dhatki, from the Sindh region of Pakistan, were traced at the South Asian University in Delhi and took part in the exercise.

Facebook helped Vikalp cross borders: he contacted a speaker of Khowar through the social network. Email, instant messaging and WhatsApp were also used to collect words.

The app is not just another online dictionary. It includes songs and subtitled videos, and shows the geographical spread of a given language. Patel and Shekhar worked on the geography, culture and politics part of the project, while the tech team comprised Patel and Soren. Kumar took overall responsibility.

Kumar said, "We can go public when we have about five languages." Sukrita is also considering letting future batches of students pick up where the current batch leaves off, increasing the number of languages in the app.

Courtesy of Silicon India

Saturday, 1 February 2014

Veteran wordsmith who created world’s first Hindi thesaurus (courtesy of Hindustan Times)

Arvind Kumar is India’s answer to Peter Mark Roget. Following in the footsteps of the British physician and lexicographer, who published the Thesaurus of English Words and Phrases (Roget’s Thesaurus) in 1852 and so contributed to the understanding of the English language, Kumar has created the ‘Samantar Kosh’, India’s first-ever modern Hindi thesaurus. Its online version is called ‘Arvind Lexicon’.

At 84, Kumar is continuously working to expand his lexicon of words from Hindi, helping millions to improve and learn India’s national language.

In 1952, Kumar, then 22 years old, was moved from a Hindi magazine to an English publication of the company where he had worked first as a cashier, then as a typesetter, proofreader and ultimately sub-editor. The shift in the medium of expression at his job meant that Kumar was always looking for the appropriate expressions in English. To solve his problem, Kumar bought his first copy of Roget’s Thesaurus.

The moment he held it in his hands, he wished somebody would compile a similar collection in Hindi.

Twenty years later, Kumar embarked upon the project himself, something nobody had attempted before.

The work began in earnest in 1978. Kumar was sure that he would be able to complete it in two years by following the pattern of Roget’s Thesaurus. He decided to write down words on small ruled cards, assign numbers to concepts and topics, and put the numbered cards in the Rogetian sequence. All he needed to do then, he thought, was fill the cards with appropriate Hindi words.

“In fact, I even went to the extent of noting down all his headings and entry numbers in a register, writing possible Hindi head words for them and making cards accordingly. All I would have to do was to fill the cards with Hindi synonyms. I was sure that walking in Roget’s footsteps, I would reach my goal pretty fast and be able to present my magnum opus to Hindi and India within two years,” he said.

But a few weeks into his venture, Kumar realised that it was not going to be that easy. He found that several concepts in Hindi, such as Brahmins and Banias, had no parallel in Roget. “Roget had organised his data on the basis of the so-called scientific classification, not on the mental associations of a human being. For people like us wheat is an edible cereal, banana an edible fruit. To a scientist, both are grasses,” said Kumar.

In order to improve upon Roget’s system, Kumar ignored all the numbers assigned to the cards and reorganized them in subject-wise groups. “I was sure a new associational system would emerge. Fortunately nobody had agreed to finance us. Thus there was nobody to chide us. Nobody to seek any explanation from us: Why was the project taking such an inordinate time! We were on our own.”

In 1994, he moved from Mumbai to Delhi. By this time he had 60,000 cards with over 250,000 hand-written expressions. His son, a surgeon, helped him digitise the data and wrote a programme to structure it. An operator took about ten months to enter the data.

Finally, in December 1996, Samantar Kosh, India’s first-ever modern and comprehensive Hindi thesaurus, containing 160,850 expressions grouped in 1,100 categories and 23,759 sub-categories, was published by the National Book Trust, India, a Government of India undertaking.

The publisher made it part of the celebrations of the golden jubilee of Indian Independence the following year. The book saw its sixth reprint in 2012. During the two decades he spent on the project, Kumar discovered various interesting facts, such as that there are as many as 2,534 synonymous names for Lord Shiva.

In 1990, his daughter, Meeta Lal, a nutritionist, stressed the need for a bilingual English-Hindi thesaurus and kickstarted the project by jotting down English equivalents for all the Hindi headwords in Samantar Kosh. “The product was the world’s largest bilingual thesaurus, in three parts (3,200 pages weighing five kg),” says Lal.

In 2011, the online version of his thesaurus, Arvind Lexicon, was launched.

It also has an Android version. Every day, this wordsmith rises at 5 am and works till 6.30 pm at his house in Chandra Nagar, in east Delhi, revising and improving the data and adding more concepts.

“The biggest compliment for my work was when a Hindi weekly, ‘Nutan Savera’, called Samantar Kosh ‘Hindi ke Maathe par Sunehri Bindi’ (a golden dot on Hindi’s forehead), and when a reader called it the Best Book of the Century. Now, my only wish is to remain active till my last breath.”