How to use text mining to quantify the evolution of a topic over time.
Reddit » Text & Data Mining
by /u/Cerricola
1w ago
Good evening, I’m currently self-teaching text mining and I’m interested in exploring techniques to measure the progression of topics over time. Let’s assume that the topics aren’t predefined, which means we need to construct them using methods like LDA, SVD, or BERTopic. The challenge is to analyze how these topics change over time. While one approach is to conduct topic modeling at separate intervals, I’m seeking a more continuous method. Any insights on how this can be achieved would be greatly appreciated. My aim is to build an index to quantify how a certain topic evolves over time. subm ..read more
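One possible reading of the "index" asked for here: fit a single topic model on the whole corpus, then average each document's weight for the topic of interest per time bucket and smooth the result. The sketch below assumes the per-document topic weights already exist (from LDA, BERTopic, or anything else); the function name `topic_index`, the monthly buckets, and the trailing-window smoothing are illustrative choices, not something the post specifies.

```python
from collections import defaultdict
from datetime import date


def topic_index(docs, topic, window=3):
    """Return [((year, month), smoothed mean weight)] for one topic.

    docs: iterable of (date, {topic_id: weight}) pairs, where the weights
    are assumed to come from a topic model fitted on the full corpus.
    Months that contain no documents are skipped, not filled with zeros.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, weights in docs:
        month = (day.year, day.month)
        totals[month] += weights.get(topic, 0.0)
        counts[month] += 1

    months = sorted(totals)
    raw = [totals[m] / counts[m] for m in months]

    # Trailing moving average over the last `window` observed months.
    smoothed = []
    for i in range(len(raw)):
        chunk = raw[max(0, i - window + 1):i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return list(zip(months, smoothed))


# Made-up document-topic weights, purely for illustration.
docs = [
    (date(2024, 1, 5), {0: 0.2, 1: 0.8}),
    (date(2024, 1, 20), {0: 0.4, 1: 0.6}),
    (date(2024, 2, 3), {0: 0.6, 1: 0.4}),
    (date(2024, 3, 9), {0: 0.9, 1: 0.1}),
]
index = topic_index(docs, topic=0, window=2)
for month, value in index:
    print(month, round(value, 3))
```

Because the topics are fitted once, the same topic ID means the same thing in every month, which avoids having to match topics across separately fitted intervals.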
Seeking Python Libraries for Removing Extraneous Characters and Spaces in Text
Reddit » Text & Data Mining
by /u/kakakak241
2w ago
I am developing a project that involves processing text data. My goal is to correct errors specifically related to unnecessary characters and spaces in texts. I'm looking for recommendations on suitable Python libraries and tools that could help address these issues.
Extraneous spaces:
Correct: "We boug ht a new car yesterday." to "We bought a new car yesterday."
Correct: "Today was a ve ry goo d da y." to "Today was a very good day."
Correct: "Hel lo! Ho w are you do ing?" to "Hello! How are you doing?"
I have explored several existing solutions, but most of them were either too basic for ..read more
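A minimal sketch of one dictionary-based way to handle the examples above, using only the standard library: adjacent fragments are glued back together when the joined form is a known word and at least one of the fragments is not. The tiny `VOCAB` set and the name `rejoin_fragments` are hypothetical; a real word list would be far larger, and ambiguous cases would probably need a language model to rank candidates.

```python
import string

# Tiny illustrative vocabulary (an assumption); use a real word list in practice.
VOCAB = {
    "we", "bought", "a", "new", "car", "yesterday", "today", "was", "very",
    "good", "day", "hello", "how", "are", "you", "do", "doing",
}


def _key(text):
    """Lowercase and drop leading/trailing punctuation for vocabulary lookup."""
    return text.lower().strip(string.punctuation)


def rejoin_fragments(sentence, vocab=VOCAB, max_parts=3):
    """Greedily merge up to `max_parts` adjacent tokens into a known word."""
    tokens = sentence.split()
    out = []
    i = 0
    while i < len(tokens):
        merged = False
        # Try the longest candidate merge first.
        for n in range(min(max_parts, len(tokens) - i), 1, -1):
            parts = tokens[i:i + n]
            joined = "".join(parts)
            # Merge only if the result is a word and some fragment is not,
            # so that valid neighbours such as "a" + "new" stay separate.
            if _key(joined) in vocab and not all(_key(p) in vocab for p in parts):
                out.append(joined)
                i += n
                merged = True
                break
        if not merged:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


for text in [
    "We boug ht a new car yesterday.",
    "Today was a ve ry goo d da y.",
    "Hel lo! Ho w are you do ing?",
]:
    print(rejoin_fragments(text))
```

The greedy pass will miss splits longer than `max_parts` fragments and can merge wrongly when a fragment pair happens to spell a different word, so it is a starting point to evaluate rather than a complete solution.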
Pre-register for News API for free access
Reddit » Text & Data Mining
by /u/Far-Amphibian3043
1M ago
Possible NLP that detects AI text
Reddit » Text & Data Mining
by /u/gckoch
1M ago
"Authorship Fingerprinting research is capable to correctly distinguish the works created by GPT 3.5, GPT 4, and human authors with recall rate 98.84% in our preliminary study." - Maiga Chang One hour technical online (free) Thu Feb 29 "Challenges in Natural Language Processing Applications"
No code LLM + Knowledge graph powered data extraction platform
Reddit » Text & Data Mining
by /u/charles-legislate
2M ago
Help with understanding Latent Dirichlet Allocation (LDA)
Reddit » Text & Data Mining
by /u/Cerricola
2M ago
Good evening, I need help understanding the maths behind the LDA model: https://ai.stanford.edu/~ang/papers/jair03-lda.pdf Although I understand the intuition of what the model is doing, to me it is still a black box.
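One way to open the black box is to run the generative story that LDA assumes: draw a topic mixture for the document from a Dirichlet prior, then for each word position draw a topic from that mixture and a word from that topic's word distribution. The sketch below simulates this with the standard library only. The vocabulary, the `alpha` and `beta` values, and the fixed document length are invented for illustration (the linked paper draws the length from a Poisson distribution instead).

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(alpha, beta, vocab, n_words, rng):
    """Simulate one document under the LDA generative process.

    alpha: Dirichlet prior over topics (one entry per topic).
    beta:  beta[k][v] is the probability of word v under topic k.
    """
    theta = sample_dirichlet(alpha, rng)  # document's topic mixture
    topics, words = [], []
    for _ in range(n_words):
        z = rng.choices(range(len(alpha)), weights=theta)[0]  # topic of this word
        w = rng.choices(vocab, weights=beta[z])[0]            # word given topic
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Illustrative values only: two topics over a four-word vocabulary.
vocab = ["goal", "match", "vote", "party"]
beta = [
    [0.5, 0.4, 0.05, 0.05],  # topic 0 favours sports words
    [0.05, 0.05, 0.5, 0.4],  # topic 1 favours politics words
]
rng = random.Random(0)
theta, topics, words = generate_document([0.5, 0.5], beta, vocab, 8, rng)
print([round(t, 3) for t in theta])
print(list(zip(topics, words)))
```

Fitting LDA is the reverse problem: only the words are observed, and inference estimates the hidden `theta`, the per-word topics, and `beta` that best explain them.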
Need Help Determining an Approach for a Project - Theme classification from movie reviews + rating prediction
Reddit » Text & Data Mining
by /u/Bubbles734
2M ago
Hello everyone! I'm very new to this field and have been tasked with a project where, by looking at movie and TV show reviews of East Asian and North American media, I need to identify themes that differentiate the two types of media. For example, let's say I'm analyzing "Parasite" (Asian media) and "Breaking Bad" (North American media). After processing the reviews, I might find that "Parasite" reviews frequently discuss themes of class disparity and societal structure, while "Breaking Bad" reviews often touch on themes of morality and personal choice. So I need to classify the reviews based ..read more
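As a possible first step before any classifier, the two groups of reviews can be compared directly to see which words are over-represented in each. The sketch below ranks words by a smoothed log-ratio of their relative frequencies; the function name `distinctive_terms`, the whitespace tokenisation, and the four short reviews are all invented for illustration and are not from the post.

```python
import math
from collections import Counter


def distinctive_terms(group_a, group_b, top=3, smoothing=1.0):
    """Return (words typical of group_a, words typical of group_b).

    Scores each word by log(p_a) - log(p_b), where p is the add-`smoothing`
    relative frequency of the word within the group.
    """
    count_a = Counter(w for text in group_a for w in text.lower().split())
    count_b = Counter(w for text in group_b for w in text.lower().split())
    vocab = set(count_a) | set(count_b)
    total_a = sum(count_a.values()) + smoothing * len(vocab)
    total_b = sum(count_b.values()) + smoothing * len(vocab)

    scores = {
        w: math.log((count_a[w] + smoothing) / total_a)
        - math.log((count_b[w] + smoothing) / total_b)
        for w in vocab
    }
    # Sort by score, breaking ties alphabetically for reproducible output.
    ranked = sorted(vocab, key=lambda w: (scores[w], w))
    return ranked[-top:][::-1], ranked[:top]


# Invented mini-reviews, purely for illustration.
reviews_a = [
    "a sharp story about class and wealth",
    "class tension drives the story",
]
reviews_b = [
    "a story about morality and choice",
    "morality and consequences of choice",
]
for_a, for_b = distinctive_terms(reviews_a, reviews_b, top=2)
print(for_a, for_b)
```

On real data, stop-word removal and a minimum frequency threshold would likely be needed, and the surfaced words are only hints at themes that still have to be interpreted by a person.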
How do I create a dataset for metaphor detection?
Reddit » Text & Data Mining
by /u/am_kolade
3M ago
Hello, I'm new here. I'm an undergraduate student who is about to start a project that requires me to create a dataset for a model. The model detects metaphors present in the English comprehension passages from a particular exam body. Please, I need guidance; I'm willing to work and learn. I just need someone who knows more than I do and can guide me so I won't keep wasting time.
Need Help with open source project dealing with NLP and LLM
Reddit » Text & Data Mining
by /u/rrtucci
4M ago
My open source software SentenceAx is a fine-tuning of BERT for splitting complicated sentences into simple ones. After 500 commits, it is thoroughly debugged on a CPU for small values for everything. Now I need someone with a GPU (I don't have one) to volunteer to train it for me. I don't know how long it will take, but probably just a few hours. This is a fairly close rewrite/improvement of the famous software Openie6, so this model and these hyperparameters have been used successfully before to train Openie6. If you decide to accept, here is the repo. SentenceAx is a standalone component of the Mapp ..read more
TDM help…am I missing something?
Reddit » Text & Data Mining
by /u/Mental_Bet6033
5M ago
Looking to do a web-scraping project for a class, specifically on US newspaper article data. Most of the APIs are pretty expensive and outside my budget. Is there a way to do web scraping on an academic database like LexisNexis? It would make my life a whole lot easier. Thanks everyone!