YouTube Transcript Summarizer with AI Chatbot
Keywords:
TRS PCI, TMS, TRS, ANN
Abstract
The YouTube Transcript Translation Web Application extracts the transcript of a YouTube video and translates it into multiple languages. This Flask-based application uses YouTubeTranscriptApi to fetch the transcript of a given video and Deep Translator to translate the extracted text. Because long transcripts often exceed the translator's query length limit, the system divides the transcript into smaller pieces and translates each one separately. Text summarization has advanced significantly, primarily due to progress in natural language processing (NLP) and machine learning. Techniques range from extractive approaches, which select key sentences directly from the text, to abstractive methods, which generate summaries by paraphrasing the content. Transformer-based models such as BERT and GPT have revolutionized summarization tasks. Previous studies have also explored video content summarization, focusing on either visual elements or transcripts. YouTube provides autogenerated transcripts for many videos, but these are often unstructured and verbose. Existing approaches to transcript summarization, such as manual editing or generic text summarizers, are time-intensive and insensitive to video-specific context. Current NLP-based tools may not integrate seamlessly with YouTube's API, or may fail to account for timestamped content, which is critical for preserving the structure of a video's narrative. The proposed YouTube Transcript Summarizer was evaluated on a dataset of transcripts from various genres, including educational videos, podcasts, and tutorials. ROUGE scores and user satisfaction surveys were used to assess the system's performance. The results showed an average ROUGE-1 score of 85%, indicating a high level of accuracy in retaining critical information.
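The chunk-and-translate step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `chunk_text` and `translate_transcript`, the 4,000-character default limit, and the choice to break on whitespace are all assumptions made for the example. The translator is passed in as a plain callable so the sketch runs without network access.

```python
from typing import Callable, List


def chunk_text(text: str, max_chars: int = 4000) -> List[str]:
    """Split text into pieces of at most max_chars, breaking on whitespace.

    The 4000-character default is an assumed, illustrative limit.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # Hard-split any single word longer than the limit so that
        # no chunk can ever exceed max_chars.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this word would overflow: close the current chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def translate_transcript(
    text: str,
    translate: Callable[[str], str],
    max_chars: int = 4000,
) -> str:
    """Translate each chunk separately and rejoin the results with spaces."""
    return " ".join(translate(chunk) for chunk in chunk_text(text, max_chars))


if __name__ == "__main__":
    # Stand-in translator for demonstration; a real one would call a service.
    demo = translate_transcript("alpha beta gamma delta", str.upper, max_chars=11)
    print(demo)
```

With the Deep Translator package installed, the callable could be, for example, `GoogleTranslator(source="auto", target="fr").translate`. The transcript text is assumed to have been fetched and joined into a single string beforehand.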
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
Terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

