Research

Understanding and preserving Baybayin

Early this year, three scientists from the Institute of Mathematics (IM) of the UP Diliman (UPD) College of Science (CS) made national headlines with their study, Block-level Optical Character Recognition [OCR] System for Automatic Transliterations of Baybayin Texts Using Support Vector Machine (Block-level OCR System).

Pino. Photo from Pino

Rodney Pino, Renier Mendoza, PhD, and Rachelle Sambayan, PhD, developed an algorithm that uses a support vector machine (SVM) to transliterate an entire paragraph of Baybayin script into Latin characters.

Baybayin is an old Tagalog writing system primarily used in the northern Philippines during the pre-Hispanic period. An SVM, despite its name, is not a physical machine but a supervised machine-learning model used for classification; here it is trained to categorize handwritten or typewritten characters.
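For readers curious about what "categorizing characters" with an SVM involves, the following is a minimal sketch, not the team's actual method. It trains a two-class linear SVM by sub-gradient descent on the hinge loss. The 3x3 bitmaps, the function names, and the training settings are all assumptions made up for illustration; a real Baybayin recognizer would need many character classes, real glyph images, and most likely an established SVM library.

```python
# Toy linear SVM trained by sub-gradient descent on the hinge loss.
# Everything here is illustrative: the 3x3 bitmaps are invented
# stand-ins for character images, not real Baybayin glyphs.

def train_linear_svm(samples, labels, epochs=20, lr=0.1, lam=0.01):
    """Return (weights, bias) for labels in {+1, -1}."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is inside the margin: push the boundary away from it.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Sample is safely classified: only apply L2 shrinkage.
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def predict(model, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def with_pixel(bitmap, index):
    """Copy a flattened bitmap and switch one extra pixel on (simulated noise)."""
    noisy = list(bitmap)
    noisy[index] = 1
    return noisy

# Flattened 3x3 bitmaps, row by row: a vertical and a horizontal stroke.
VERT = [0, 1, 0, 0, 1, 0, 0, 1, 0]
HORZ = [0, 0, 0, 1, 1, 1, 0, 0, 0]

train_x = [VERT, HORZ,
           with_pixel(VERT, 0), with_pixel(HORZ, 0),
           with_pixel(VERT, 8), with_pixel(HORZ, 8)]
train_y = [1, -1, 1, -1, 1, -1]  # +1 = vertical stroke, -1 = horizontal stroke

model = train_linear_svm(train_x, train_y)

# Classify two noisy strokes that were not part of the training set.
print(predict(model, with_pixel(VERT, 2)))
print(predict(model, with_pixel(HORZ, 6)))
```

The sketch separates only two shapes. Telling apart all Baybayin characters would require combining many such two-class decisions or using a multi-class SVM, and the article does not say which approach the researchers took.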

The study claims to be the first OCR work able to recognize Baybayin at the block, or paragraph, level.

In email correspondence with UPDate Online, Pino explained that his interest in studying Baybayin was influenced by Mendoza, his master’s thesis adviser at the IM.

“He was the one who discussed with me and revealed his intention to develop a Baybayin OCR for my MS thesis. He then invited Sambayan, a specialist in SVM, to work with the Baybayin OCR,” said Pino.

Pino’s master’s thesis is titled Baybayin Optical Recognition System Using Support Vector Machine. He finished his Master of Science in Applied Mathematics in 2021 and was named most outstanding Master of Science graduate by the CS.

Pino said he and his team chose to conduct the study because “Baybayin and other old scripts are living proof that our nation has a unique systematic manner of writing. The one we can consider our own writing system and feel proud of and humble about as Filipinos.”

According to Pino, Baybayin OCR is still in its infancy, unlike the highly developed OCR systems for scripts such as the Roman alphabet and Han characters.

A UPD bicycle lane with Baybayin scripts. Photo by Bino Gamba, UPDIO

“Fortunately, the SVM method we tried out for the Baybayin script worked and has a good recognition rate when classifying or identifying Baybayin characters,” he said.

Pino added that technology has played an important role in the ongoing restoration of the script. Mobile keyboards such as Gboard include indigenous scripts in their keyboard settings, and Baybayin can now be typed in word processors.

Currently, however, “there are none that translate or transliterate Baybayin manuscripts into comprehensible Latin or Roman text,” he said.

With the technology that Pino and his team developed, the Baybayin writing system can be learned, read, and written more conveniently.

Pino said the team’s study supports House Bill 1022, the bill that seeks to recognize Baybayin as a national writing system.

Looking forward, Pino and his team hope to develop an application that can function as a two-way text converter between Baybayin and Latin script, similar to Google Lens’ translate feature.