Recognize people by writing style

I’ve seen people make ML models that create vector embeddings of faces and voices for the purpose of automated recognition.
Are there algorithms that do the same for text inputs? I don’t mean sentiment analysis, information extraction, or genre categorization; I mean representations of an author’s writing style.

I looked around already, but tell me if this is the wrong subreddit for this.

My knowledge of the space is old (pre-embeddings), but there was an area of research called author identification (also known as authorship attribution or stylometry). If I remember right, the Federalist Papers were a common dataset because the authorship of some of the essays is disputed.

One challenge is that many things correlate with author, like genre or date. It’s often easier for ML methods to pick up on those cues than on the author’s actual writing style.
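To make the topic-vs-style point concrete: classic stylometry work on the Federalist Papers leaned on function-word frequencies ("the", "upon", "whilst", ...), since those depend less on subject matter than content words do. Here is a minimal sketch of that idea in plain Python. The word list, `style_vector`, and `cosine` are names I made up for illustration, and the list is far shorter than what real studies use.

```python
import math
import re
from collections import Counter

# Tiny illustrative list of English function words (an assumption for this
# sketch; real stylometry studies use lists of a few hundred words).
FUNCTION_WORDS = [
    "the", "of", "and", "to", "a", "in", "that", "is", "it", "on",
    "by", "upon", "while", "whilst",
]


def style_vector(text):
    """Return the relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

You would compute `style_vector` for a disputed document and for each candidate author's known writing, then compare with `cosine`. It is a crude baseline, but it is less likely to latch onto topic than a bag of content words.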

Hope this gives you some things to google.

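You could train or fine-tune such an embedding model with a contrastive loss if you have a large corpus of documents by different authors (with several documents per author). The loss pulls embeddings of documents by the same author together and pushes embeddings of documents by different authors apart.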

I am unsure how well that generalizes to unseen authors, but it could work if you have labelled documents from the authors you want to recognize.

Of course, this would not be able to match an author across very different genres, e.g. if some of their documents are legal texts and the others are diary entries. But it may be worth a try.
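For reference, here is a rough sketch of what a contrastive loss over author-labelled documents could look like. It is plain Python that only computes the loss value for embeddings you already have. In practice you would use a deep learning framework, get the embeddings from a text encoder, and backpropagate through it. The function name, the supervised InfoNCE-style formulation, and the `temperature` default are my assumptions, not a specific published recipe.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def supervised_contrastive_loss(embeddings, authors, temperature=0.1):
    """Average -log p(positive | anchor) over all same-author pairs in the batch.

    embeddings: list of vectors, one per document
    authors:    list of author labels, aligned with `embeddings`
    Returns 0.0 if no author appears more than once in the batch.
    """
    n = len(embeddings)
    total, num_pairs = 0.0, 0
    for i in range(n):
        # Scaled similarity of the anchor document to every other document.
        logits = {
            k: cosine(embeddings[i], embeddings[k]) / temperature
            for k in range(n)
            if k != i
        }
        if not logits:
            continue
        log_denominator = math.log(sum(math.exp(x) for x in logits.values()))
        for k, logit in logits.items():
            if authors[k] == authors[i]:  # same author: a positive pair
                total += log_denominator - logit
                num_pairs += 1
    return total / num_pairs if num_pairs else 0.0
```

The loss is low when documents by the same author sit close together relative to everyone else in the batch, and high otherwise. To fight the genre problem, one option is to build batches so that each author's documents span different genres or topics, if your corpus allows it.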