FSBC: Fast String-Based Clustering for HT-SELEX Data

What is FSBC?

Fast String-Based Clustering (FSBC) estimates clusters by searching for over-represented strings of various lengths, which are treated as candidate target-binding regions. It is also designed for fast calculation: it reduces the search space and works from a single round of HT-SELEX data, typically the final round, while accounting for the imbalanced nucleobase composition that arises during the aptamer selection process. The calculation time and clustering accuracy of FSBC were compared with those of four conventional clustering methods (FASTAptamer, AptaCluster, APTANI, and AptaTRACE) using HT-SELEX data of more than 15 million oligonucleotide sequences. FSBC, AptaCluster, and AptaTRACE completed clustering of all sequence data, and FSBC and AptaTRACE achieved higher clustering accuracy. Among all methods compared, FSBC showed the highest clustering accuracy and the second-fastest calculation speed.
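The idea of searching for over-represented strings of several lengths while accounting for imbalanced nucleobases can be illustrated with a small sketch. This is not the published FSBC or pFSBC implementation: the function names (`base_frequencies`, `overrepresented_strings`, `assign_clusters`), the z-score under an independent-base binomial model, and the "first matching string" cluster assignment are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def base_frequencies(seqs):
    """Estimate nucleobase frequencies from the pool itself (assumed background model)."""
    counts = Counter()
    for s in seqs:
        counts.update(s)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


def overrepresented_strings(seqs, k_min=4, k_max=6, top=5):
    """Score every substring of length k_min..k_max by a z-score (illustrative scoring).

    The expected count of a string assumes independent bases drawn with the
    pool's own (possibly imbalanced) base frequencies.
    """
    freqs = base_frequencies(seqs)
    results = []
    for k in range(k_min, k_max + 1):
        observed = Counter()
        n_windows = 0
        for s in seqs:
            for i in range(len(s) - k + 1):
                observed[s[i:i + k]] += 1
                n_windows += 1
        for kmer, obs in observed.items():
            p = 1.0
            for base in kmer:
                p *= freqs[base]
            expected = n_windows * p
            sd = sqrt(n_windows * p * (1.0 - p))
            z = (obs - expected) / sd if sd > 0 else 0.0
            results.append((z, kmer, obs))
    # Highest z-score first; ties broken alphabetically for determinism.
    results.sort(key=lambda t: (-t[0], t[1]))
    return results[:top]


def assign_clusters(seqs, strings):
    """Label each sequence with the first ranked string it contains, else None."""
    return {s: next((k for k in strings if k in s), None) for s in seqs}


if __name__ == "__main__":
    # Toy pool (hypothetical sequences, not real HT-SELEX data).
    pool = ["ACGGTTGGAC", "TTGGTTGGCA", "CAGGTTGGTT", "ACACACACAC"]
    ranked = overrepresented_strings(pool)
    for z, kmer, obs in ranked:
        print(f"{kmer}\tcount={obs}\tz={z:.2f}")
    print(assign_clusters(pool, [k for _, k, _ in ranked]))
```

Estimating the background from the pool itself is one simple way to keep a base that dominates the selected library from inflating the scores of strings rich in that base; the actual scoring and search-space reduction used by FSBC are described in reference [1].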

Download

FSBC (R) [1]
fsbc_0.1.0.tar.gz (369,976 B) Download
If you use this program, please cite reference [1].
pFSBC (Python) [2,3]
pfsbc_python_0.1.0.tar.gz (17,246 B) Download
If you use this program, please cite references [2,3].

References

[1] Shintaro Kato, Takayoshi Ono, Hirotaka Minagawa, Katsunori Horii, Ikuo Shiratori, Iwao Waga, Koichi Ito, and Takafumi Aoki, "FSBC: Fast string-based clustering for HT-SELEX data," BMC Bioinformatics, vol. 21, no. 263, pp. 1-10, June 2020.
Open access
[2] Takayoshi Ono, Shintaro Kato, Koichi Ito, Hirotaka Minagawa, Katsunori Horii, Ikuo Shiratori, Iwao Waga, and Takafumi Aoki, "Parallel implementation of motif-based clustering for HT-SELEX dataset," Proc. IEEE Int'l Conf. Bioinformatics and Bioengineering, pp. 50-55, October 2019 (BIBE2019 Best Student Paper Award).
IEEE Xplore
[3] Shintaro Kato, Takayoshi Ono, Masaki Ito, Koichi Ito, Hirotaka Minagawa, Katsunori Horii, Ikuo Shiratori, Iwao Waga, and Takafumi Aoki, "Parallel implementation of string-based clustering for HT-SELEX data," EAI Endorsed Trans. Bioengineering and Bioinformatics, no. e4, pp. 1-11, October 2020.
Open access

Contact

Shintaro Kato (NEC Solution Innovators, Japan/Graduate School of Information Sciences, Tohoku University, Japan)
katou-s-mxn@nec.com
