TextStatistics - Get a text file wich contains English text

Overview

TextStatistics

This program get a text file wich contains English text. The program analyses the text, and print some information. For this program I used the NLTK library of Python, for analyze the natural language. For find the characters names I analyzed the POS of the sentences.

For example, I tested the program on this text file: https://github.com/kzjeef/algs4/blob/master/burrowswheelers/testfile/dickens.txt

The results:

  1. Lines count: 84583

  2. Words count: 760339

  3. Unique words count: 9206

4.1. Words AVG in sentence: 8.989264982325054

4.2. Longest sentence:

When I first peeped in at
the gate of the lifeless quadrangle, and started from the
mouldering statue becoming visible to me like its guardian ghost;
when I stole round by the back of the farm-house, and got in among
the ancient rooms, many of them with their floors and ceilings
falling, the beams and rafters hanging dangerously down, the
plaster dropping as I trod, the oaken panels stripped away, the
windows half walled up, half broken; when I discovered a gallery
commanding the old kitchen, and looked down between balustrades
upon a massive old table and benches, fearing to see I know not
what dead-alive creatures come in and seat themselves, and look up
with I know not what dreadful eyes, or lack of eyes, at me; when
all over the house I was awed by gaps and chinks where the sky
stared sorrowfully at me, where the birds passed, and the ivy
rustled, and the stains of winter weather blotched the rotten
floors; when down at the bottom of dark pits of staircase, into
which the stairs had sunk, green leaves trembled, butterflies
fluttered, and bees hummed in and out through the broken door-ways;
when encircling the whole ruin were sweet scents, and sights of
fresh green growth, and ever-renewing life, that I had never
dreamed of, - I say, when I passed into such clouded perception of
these things as my dark soul could compass, what did I know then of
Hoghton Towers?

5.1. Most popular word: [('the', 38410)]

5.2. Most popular word exclude stop words: [('said', 4253)]

  1. Longest word sequence without 'k' words:

    herself, would have been prostration, and that she would have fallen, in point of fact, a victim.'

    'Now, Dombey! - ' says the Major, resuming his discourse with great energy.

    'I beg your pardon,' interposes Cousin Feenix. 'Allow me another word. My friend Dombey will permit me to say, that if any circumstance could have added to the most infernal state of pain in which I find myself on this occasion, it would be the natural amazement of the world at my lovely and accomplished relative (as I must still beg leave to call her) being supposed to have so committed herself with a person - man with white teeth, in point of fact - of very inferior station to her husband. But while I must, rather peremptorily, request my friend Dombey not to criminate my lovely and accomplished relative until her criminality is perfectly established, I beg to assure my friend Dombey that the family I represent, and which is now almost extinct (devilish sad reflection for a man), will interpose no obstacle in his way, and will be happy to assent to any honourable course of proceeding, with a view to the future, that he may point out. I trust my friend Dombey will give me credit for the intentions by which I am animated in this very melancholy affair, and - a - in point of fact, I am not aware that I need trouble my friend Dombey with any further observations.'

    Mr Dombey bows, without raising his eyes, and is silent.

    'Now, Dombey,' says the Major, 'our friend Feenix having, with an amount of eloquence that Old Joe B. has never heard surpassed - no, by the Lord, Sir! never!' - says the Major, very blue, indeed, and grasping his cane in the middle - 'stated the case as regards the lady, I shall presume upon our friendship, Dombey, to offer a word on another aspect of it. Sir,' says the Major, with the horse's cough, 'the world in these things has opinions, which must be satisfied.'

    'I

  2. Biggest number: twenty million

  3. Colors:

    dict_items([('black', 258), ('white', 183), ('red', 184), ('yellow', 28), ('brown', 45), ('blue', 111), ('green', 77), ('silver', 38), ('gold', 53), ('purple', 13), ('orange', 5), ('pink', 5), ('gray', 10)])

9.1. Names:

{'Bill Bitherstone', 'Stephen', 'Halloa', 'Dover', 'Master Barnet', 'Long', 'Thou', 'Loose', 'Mr. Silverman', 'Monsieur Defarge', 'Robert', 'Quebec', 'George Goff', 'Great', 'Transcendentalism', 'Stryver', 'Memory', 'Shaker Village', 'Mrs Baps', 'Macbeth', 'Woe', 'Temple Bar', 'Grunt', 'Character Beneath', 'Adelina', 'Joey B', 'Robby', 'Jane', 'Heavens', 'Lake Champlain', 'Walker', 'Josh', 'Jacques', 'Pitt', 'Yonder', 'Success', 'Emma Gordon', 'Captain Sherman', 'Major Allison', 'Bayswater', 'Miss Pipson', 'Miss Bule', 'Afraid', 'Taste', 'Young Toots', 'Alexander', 'Guilty', 'Hartford', 'Gardens', 'Trimmer', 'Essay', 'Mr. Stryver', 'Jacques Four', 'Gradgrinds', 'Major Gally', 'Samuel Thurston', 'Mills', 'Northward', 'Joseph Gradgrind', 'Parnassus', 'Walter Gay', 'Show Concord', 'Howe', 'Johnny Cake', 'Rachael', 'Earth', 'Well', 'Yorick', 'John Bunsby', 'Simple', 'Childerth', 'Henry', 'Drawn', 'Jim Blake', 'Martyrdom', 'Stop', 'Blackpot', 'Mr. P. Arpin', 'Mr. Swidger', 'Domber', 'Mr. Cruncher', 'Gorgon', 'Miss Josephine Sleary', 'Boy', 'Noah', 'Line', 'Universal Distrust', 'Cheerless', 'Patriots', 'Citizen Doctor', 'Wiggs', 'Rumour', 'Master Skettles', 'Birth', 'Close', 'Antoine', 'Jane Gradgrind', 'Dombei', 'Louis Fifteen', 'Nature', 'Lessons', 'Thick', 'Laura Bridgman', 'Great Britain', 'Honest Captain Cuttle', 'Alfred Starling', 'Billy Joper', 'Hoofs', 'Citizen Gabelle', 'Quickly', 'Mount', 'Boys', 'Sir Barnet Skettles', 'Driven', 'Anna Dominoes', 'Monseigneur', 'Defoe', 'Luckily', 'Joe Dennis', 'Ikey', 'Captain', 'Later Tellson', 'Lady Scadgers', 'Miss Pankey', 'Old Stephen', 'Caddo Gazette', 'Wisdom', 'Cave', 'Bartlemy Fair', 'John', 'Mr. Verity Hawkyard', 'Miss Griffin', 'Dark', 'Soon', 'Chancery', 'Toodles', 'Walks', 'Clown', 'Doctor Crocus', 'Mrs Richards', 'Lengths', 'Ease', 'Mr. Childers', 'John Bunyan', 'Miss Toxin', 'Man', 'Tush', 'Shakspeare', 'Benton', 'Upper Sandusky', 'Granville', 'Thank GOD', 'Nearer', 'Glubb', 'Stand', "'Well", 'Think', 'Minerva', 'Dotnbey', 'Redlaw', 'Edith', 'Lucie Darnay', 'Doctor Blimber', 'Joseph Bagstock', 'Otherwise', 'Alpha', 'Heavily', 'Jack Governor', 'Hotel', 'Joseph', 'Joey', 'Bismillah', 'Twenty Pounds', 'Mr. Lorry', 'Look', 'Temperance', 'London Bridge', 'Christian', 'Brother Gimblet', 'Ally', 'Lombard Street', 'Dinner Hall', 'Mr. Robbins', 'Quite', 'Dickson', 'Mr. Charles Darnay', 'Noise', 'Queen', 'Towlinson', 'York', 'Mrs Staggs', 'Poor Mr Toots', 'Right', 'Married', 'Richards', 'Cincinnatus', 'Williams', 'Rapped', 'Sainte Guillotine', 'Master Cruncher', 'Deptford', 'Miss Tox', 'Saint Antoine', 'Dombeys', 'Unless', 'Grinder', 'Angel', 'Shaken', 'Lord Chesterfield', 'Paul Dombey', 'Scott', 'Quiet', 'Delight', 'Physician', "God A'mighty", 'Months', 'Arpin', 'Jury', 'George Gradgrind', 'Owners', "Haply 't", 'Preston', 'Master Paul', 'Bearin', 'Syd', 'Mama', 'Scorn', 'Bed', 'Solomon John', 'Gas', 'Fawkes', 'Madison', 'John Gradgrind', 'La Guillotine', 'Hurry', 'Reid', 'Aged', 'Mr. Crunches', 'Behave', 'Father Mathew', 'Gaspard', 'Jack Adams', 'Help', 'Happy Florence', 'Sleary', 'Galileo', 'Husbands', 'Gratitude', 'Brandy', 'Poor Carton', 'Kentucky Tom', 'Graves', 'Harvey', 'Summer', 'Prison Discipline', 'Isaac', 'Tommy Screwzer', 'Randal', 'Lord Feenix', 'Calm', 'Hilary', 'Romuluses', 'Coop', 'Young Barnet', 'Izaak Walton', 'Tiddler', 'Tozer', 'Providence', 'Alexandre Manette', 'Speak', 'Freedom', 'Care', 'Mr. Jerry Cruncher', 'Poor', 'Black', 'Sharkey', 'Harriet Carker', 'Thank Heaven', 'Parker Peps', 'Strange News', 'Old Bailey Spi', 'Diedrich Knickerbocker', 'Obey', 'Wild', 'Buggins', 'Nowt', 'Joe Bagstock', 'Guy Fawkes', 'Adieu', 'Mr. Granville Wharton', 'Missus', 'John Milton', 'Master B', 'Hay', 'Vixen', 'Willingly', 'Midnight Murder', 'Church', 'Bryanstone Square', 'Bench Bar', 'Mr. Chadwick', 'Stone Lodge', 'Anon', 'Pan', 'Queenston', 'Gillespie', 'Grand Vizier', 'Brother John', 'Mariner Time', 'Boz', 'Baby', 'Thingummy', 'Son', 'Mr. Greenough', 'Whereas', 'Curious', 'Shooter', 'Ground', 'Lewis', 'Charles Evremonde', 'Greater', 'Witness Cousin Feenix', 'Faith', 'Cleopatra', 'Benjamin', 'School', 'Classes', 'Creation', 'Passion', 'Slight', 'Daughter', 'Montreal', 'Magsman', 'Cross', 'Wild Ass', 'Captain Cutlle', 'Wolf', 'Emma', 'Messenger', 'Mark Lane', 'Harlequin', 'Harriet', 'Jacques Five', 'Old Sol', 'Breakfast', 'Jaundiced Jail', 'Patty', 'Liggs', 'Air', 'James Cottingham', 'Sardana', 'Round', 'Lewiston', 'Champagne', 'Wait', 'Peg', 'Deep', 'Anxious', 'Camden Town', 'Sissy', 'Say', 'Toby Magsman', 'JOHN', 'Public', 'Stick', 'Slowly', 'Great Heaven', 'Samson', 'Hell Gate', 'Miss Dombey', 'Phenomenon', 'Kidderminthter', 'Facts', 'Walter Gay Walter', 'Pantheon', 'Mary Queen', 'Charley Swidger', 'Hence Monseigneur', 'Giddiness', 'John Carker', 'Bunsby', 'Seen', 'Bloody Island', 'Kenilworth', 'Hot', 'Doe', 'Cicero', 'Florence Dombey', 'Toby', 'William', 'Slavery', 'Madge Wildfire', 'Saint Antoine Quarter', 'Lady Fareway', 'Tommy', 'Fool', 'East India', 'Johnny Tetterby', 'Burgess', 'Bounderby', 'Little Rifle', 'William Hine', 'Licker', 'Babylon', 'Leonora', 'Mr. Harthouse', 'Seltzer', 'Balls Pond', 'Mr. Fareway', 'Cheer', 'Change', 'Peace', 'Mr. Carton', 'Harris', 'Rend Foulon', 'Rule Britannia', 'Cruncher', 'Laughter', 'Major Gillespie', 'Bright', 'Pull', 'Captain Cuttle', 'Hurrah', 'Wolfe', 'Castlereagh', 'Medusa', 'STRAW HAT', 'Biler', 'Jarber', 'Crypt', 'Liberty', 'Poor Justice', 'Schools', 'Josephine', 'Mound', 'Gay', 'Monsieur Paris', 'Mr. Arndt', 'Barnum', 'Monsieur', 'Debt', 'Griffin', 'Guard', 'Chaney', 'Tabby', 'Anno Domini', 'Potter', 'Clark', 'Mother', 'John Herschel', 'Dim', 'Mariner', 'Columbus', 'Therry', 'Solomon Pross', 'Porter', 'Michaelmas Term', 'Edward Cuttle', 'Englishman', 'Dennis', 'Charles MacStinger', 'Mind', 'Clock Room', 'Wisconsin', 'Perch', 'Alice', 'Free', 'George Washington', 'Latin', 'Paul Jones', 'J. Bagstock', 'Miss Floy', 'Foulon', 'Josh Bagstock', 'Jacques One', 'Write', 'Grace', 'Quarter Sessions', 'Seven Dials', 'Brave', 'Coketowner', 'Sam', 'Giant', 'John Solomon', 'Major', 'Dutchman', 'Equally', 'Royal', 'Thleary', 'Sandusky', 'Miss Florence', 'Warbler', 'Hastily', 'Fame', 'Brother', 'Noakes', 'Young Thomas', 'Cassim Baba', 'Life', 'Bad Fortune', 'Publicly', 'Brother Parksop', 'Briggs', 'Sol Gills', 'Walter', 'Virgil', "Mrs Dombey's", 'Run', 'Sad', 'Bill', 'Martyr', 'Mr. Beaver', 'Food', 'Single', 'Bury', 'Persons', 'Pond', 'Remus', 'Nipper', 'Old', 'Susan', 'Sir', 'Miss Pross', 'Heart', 'Danger', 'Mrs Chic', 'Paralysis', 'Dowager', 'Ally Loo', 'Cuttle', 'Hooroar', 'Shake', 'Doctor Parker Pep', 'Chiefly Matrimonial', 'Banks', 'Damp', 'Miss Bounderby', 'Judas Iscariot', 'Captain Gills', 'Mr. William Button', 'Steady', 'Rather', 'Pegler', 'Robbed', 'Louisa Chick', 'Wife', 'Mrs Blimber', 'Jackson', 'Apples', 'Blacksmith', 'Lowell', 'Parker', 'Brother Hawkyard', 'Grundy', 'Clean', 'Vagabond', 'Tilson', 'Limited', 'Cook', 'Abudah', 'Mrs Bokum', 'Uncle', 'Somewhere', 'Woman', 'Unbidden', 'Books', 'Whether Bunsby', 'Mouse', 'Framed', 'Mystery', 'Within', 'Wi', 'Beyond', 'Torn Gradgrind', 'Temperance Songs', 'Shawnees', 'Miss Fareway', 'Oliver Caswell', 'Tom', 'Mother Bunch', 'Miff', 'Objects', 'Bar', 'Jerry', 'Broadway', 'Sorrow', 'Methought', 'Massachusetts Asylum', 'Madame Defarge', 'Homer', 'Mr. Kidderminster', 'Grown', 'Little Paul', 'Loo Gradgrind', 'Bird Waltzes', 'Old Joe', 'Aggerawayter', 'Least', 'Miss Nipper', 'James Surgette', 'Mary Tofts', 'Mill', 'Truly', 'Mr. Gradgrind', 'Miss Susan Nipper', 'Poor Paul', 'Heir', 'Witness Major Bagstock', 'Vistors', 'Son Dombey', 'Ology', 'Lucie Manette', 'Adam', 'Sydney', 'Toodle', 'Express', 'Yo', 'Lord Burleigh', 'Misses Brown', 'Strike', 'Strange', 'Marry', 'Timid', 'Young Jerry', 'Cly', 'Nobody', 'Toots', 'La Trappe', 'Unsought', 'Poor Berry', 'Scots', 'Lengthy', 'Lucretia Tox', 'Ships', 'Mr. Harker', 'Lord Nelson', 'God', 'Saint Paul', 'Took', 'Chateau', 'Sheep', 'Pale', 'Kentucky Giant', 'Baps', 'Dickenson', 'Stakes', 'Falsest', 'Queen Victoria', 'French Lewis', 'Never', 'Mr. Willet', 'Warwick', 'Lucie', 'Slackbridge', 'Bah', 'Mr. Redlaw', 'Mary', 'Carton', 'Molten', 'Mr. Robins', 'George Silverman', 'Manuel', 'Rich', 'Mr. Magsman', 'Ogress', 'Chairmen', 'Wales', 'Master Bitherstone', 'Pet', 'Mrs Brown', "'which", 'Jerry Cruncher', 'Master', 'Evremonde', 'Evil', 'Hunger', 'Laura', 'Jack Bunsby', 'Jerusalem', 'Set', 'Immense', 'Better', 'St. Louis', 'Pipchin', 'Old Foulon', 'Mr. Thorn', 'Alcohol', 'Merry', 'Prairie', 'Esquire', 'Impassive', 'Struck', 'Whereupon Doctor Crocus', 'Woodman', 'Pontefract', 'Mr. Cooper', 'Clothes', 'Tox', 'Number One', 'James River', 'Inn', 'Edith Granger', 'Harthouse', 'Tellson', 'Christ', 'Miss Berry', 'Beethoven', 'Robinson', 'Sol', 'Missis', 'Louder', 'Water', 'Mr. James Harthouse', 'Gulliver', 'Hoghton Towers', 'Venus', 'Babel', 'Master Dombey', 'Mr. Catlin', 'Mr. E. S. Baker', 'Bankrupts', 'Robert Potter', 'Mr. Ralph Waldo Emerson', 'Miss', 'Spitfire', 'Turnham Green', 'Cannon', 'Devil', 'Wilful Murder', 'Suspicion', 'Lovely Peg', 'Mister', 'Nine', 'Halifax Harbour', 'Parade', 'Black Hollow', 'Major Bagstock', 'Hereupon', 'Imagine', 'Rhode Island', 'Spy', 'County Ark', 'Crash', 'Lady Jane Finchbury', 'Picture Room', 'Terence', 'Fact', 'Miscreant Foulon', 'Junior', 'Nurse Wickam', 'Manager', 'Mr. Dickens', 'Potomac Creek', 'Alas', 'Shakespeare', 'Tom Gradgrind', 'Mr. Carlyle', 'Happiness', 'Albany', 'Saint Francis Xavier', 'Cecilia Jupe', 'Darwen', 'Something', 'Gimblet', 'Old Mrs Pipchin', 'Smith', 'Barbados', 'Feeder', 'Paris', 'Miss Louisa', 'Joes', 'Legion', 'Dearest Florence', 'Henry James', 'Poor Florence', 'Boldface', 'Merrylegs', 'Shaker', 'V', 'James Carker', 'Rose', 'Jack', 'Good Mrs Brown', 'Whereby', 'Power', 'Hallo', 'Dear Doctor Manette', 'Destiny', 'Place', 'Young Mr. Thomas', 'Funerals', 'John Derrick', 'Gabelle', 'Buffalo', 'Defarge', 'Mrs Pipchin', 'Vitellius', 'Bob', 'Night', 'Baker', 'Niblo', 'Thomas', 'Saint Pancras', 'Papa', 'Bottles', 'James', 'Nephew', 'Ketch', 'Mint', 'Jamaica Harbour', 'Lord Mayor', 'Banker', 'Gooseberry', 'Jem Carker', 'Socrates', 'Johnny Cakes', 'Bless', 'Richard Whittington', 'Queer Street', 'Hoity', 'Louisville', 'Upper Canada', 'Earthly Copper Gallon', 'Family Mansion', 'Try', 'Solomon', 'Matrimonially', 'Lord', 'Josiah Bounderby', 'Coketown Hands', 'Tickle', 'Keys', 'Skettles Junior', 'Robinson Crusoe', 'Villain Foulon', 'Imperious', 'Mr. Undery', 'Jefferson', 'Belleville', 'Native', 'Darnay', 'Carthages', 'Derrick', 'Mr. William Swidger', 'Gad', 'Dinner', 'Fall', 'Vide Poche', 'Blackburn', 'Mrs Dombey', 'Herein', 'Mrs Skewton', 'Warwick Castle', 'Tumbler', 'Believe', 'Romulus', 'Divine Right', 'Niagara', 'Market', 'Sleep', 'Tom Thumb', 'Henceforth', 'Bitherstone', 'Sabbath', 'Might Mr. Harthouse', 'False', 'Grinders', 'Chowley', 'Lorry', 'Maria', 'Hebrew', 'Please', 'Marshal', "Edith 'It", 'Parks', 'Shave', 'Jupiter', 'Martha', 'Bad Mrs Brown', 'Ifs', 'Zobeide', 'Mint Julep', 'Silverman', 'Roar', 'Free Press', 'Lake Ontario', 'Monsieur Gabelle', 'Jamaica', 'Cape Clear', 'Thomas G. Allison', 'Lake Erie', 'Louisa', 'Mr. Granville', "Mr. M'Kane", 'Beast', 'Manchester', 'Bull', 'Dickens', 'Cinderella', 'Brig Place', 'Arabian', 'Fat Ladies', 'Coketown', 'Really', 'Barbados Harbour', 'Light Monsieur', 'Grise', 'Time', 'Folly', 'Miss Tox Lucretia', 'Kirby', 'Mr. Radley', 'Ruins', 'Thomas Gradgrind', 'Damme', 'Papa Teach', 'Old Solomon', 'Carelessly', 'Mr. Darnay', 'Monsieur Orleans', 'Clapham Rise', 'Robin', 'Jarvis Lorry', 'States', 'Lucifer', 'Already', 'Familiar', 'Citizen Evremonde', 'Goodbye', 'Mr. Paap', 'Mr. Webster', 'Midshipman', 'Blot', 'Mrs Wickam', 'Mr. Hart', 'Sir Joseph Bagstock', 'Crush', 'Kate', 'High', 'Jiddy', 'Di', 'Gloomy', 'Denounced', 'Perfect', 'Berinthia', 'Mr. Sparsit', 'Money', 'Miss Harriet', 'Paul', 'Sermuchser', 'Loo', 'Mr. Hawkyard', 'Hell', 'Ross', 'Treason', 'Ladybird', 'Eager', 'Chicken', 'Monsieur Manette', 'Poverty', 'Berry', 'Mr. Barsad', 'Susan Nipper', 'Naturally', 'Turn', "'you", 'Fluffy', 'Carondelet', 'Jim Crows', 'Parksop', 'Whether Miss Tox', 'Dead', 'Good Heaven', 'Blind', 'Poor Briggs', 'Mount Auburn', 'Far', 'Merry Christmas', 'Inside', 'Captain John Bunsby', 'Sabbaths', 'Eas', 'Post Office', 'Always', 'Tom Tiddler', 'Far West', "Mr. Lorry's", 'Antony Bagstock', 'Mrs Miff', 'Chick', 'Pipson', 'Mrs Granger', 'Steam Power', 'Sir Hubert Stanley', 'Echoing Footsteps', 'Ed', 'Almost', 'Beard', 'Bitzer', 'State', 'Public Prosecutor', 'Correct', 'Judy', 'Withers', 'East', 'Whether', 'Yes', 'Dear Carton', 'Pill', 'Betsey Jane', 'Livery', 'Carker', 'Wulturs', 'Spoke', 'Chase', 'Lee', 'Mr. Ross', 'Tea', 'Mr. William', 'Sydney Carton', 'Fresh', 'Piccadilly', 'Saint', 'Mrs Blockitt', 'Sir David Brewster', 'Dombey', 'Old Sol Gills', 'Lovely', 'Peggy', 'Yellow Jack', 'Saint George', 'Bench', 'Cousin Feenix', 'Dwell', 'Madam', 'Rip Van Winkle', 'John Wilburn', 'Spinner', 'Odours', 'Crocus', 'Hands', 'James R. Vinyard', 'Pitcher', 'Mr. Thomas Gradgrind', 'Old Joe Bagstock', 'John Barsad', 'Fancy Ball', 'Reason', 'Stricken', 'Miss Carker', 'Fifty', 'Whether Mr Dombey', 'Signor Jupe', 'Business', 'Poetry', 'Mr. Fall', 'Awake', 'Said Mr. Fareway', 'Gone', 'Busy', 'Street', 'Master Kidderminster', 'Nickits', 'Old Time', 'Fanny', 'Vinyard', 'Sir Richard Whittington', 'Number Three', 'Highly', 'Roman', 'Rachel', 'Fancy', 'Velveteen', 'Le Sage', 'Family', 'Ned Cuttle', 'Three', 'Bankruptcy', 'James Harthouse', 'Lots', 'Bagstock', 'Thomas Gradgrind Esquire', 'Pepper', 'Hope', 'Royalty', 'Everywhere', 'Captain Cuttle Put', 'Sit', 'Major C. Gally', 'Fragile', 'Soda', 'Wal', 'Respectin', 'Jemima', 'Poor Susan', 'Jade', 'Saint Dunstan', 'Shadows', 'Florence', 'Pool', 'Liverpool', 'Shame', 'Billiard Ball', 'Johnson', 'Machinery', 'Midnight', 'Doctor So-and-so', 'Ocean', 'Hilary Term', 'Jacques Three', 'Myra', 'Solomon Gills', 'Jew', "Sir Barnet's", 'Harrisburg', 'Job', 'Streets', 'Majesty', 'Banquet', 'Miss Manette', 'Habeas Corpus', 'Ernest Defarge', 'Timber Doodle', 'Asylum', 'Lady Skettles', 'Pittsburg', 'Mesrour', 'Wally', 'Little Dombey', 'Diogenes', 'Monsieur Charles', 'Bubler', 'Johnny', 'Mount Vernon', 'Hooray', 'Mr. Tom', 'Mrs', 'Miller', 'Arthur', 'Christmas', 'Notre Dame', 'Mrs Chick', 'Punch', 'Reins', 'Towards Soho', 'Boast', '_I_', 'Bravo', 'Brogley', 'De Foe', 'Negress Caroline', 'Thus', 'Dutchmen', 'Famished', 'Sally', 'Bully Stryver', 'Lucretia', 'Poor Fanny', 'John Brown', 'Double Room', 'Whereupon', 'Feenix', 'Hill', 'Powler', 'Miss Lucie', 'Thtick', 'Madeira', 'MR. JAMES', 'Gravesend', 'Mr. Sleary', 'Staggs', 'Alexander MacStinger', 'Cauliflower Act', 'Darkness', 'Beauty', 'Away', "Ed'ard Cuttle", 'Deeper', 'Trump', 'Landlady', 'Peter Piper', 'Rathcal', 'Polly', 'Quadrilles', 'Mr. James Gillespie', 'Healthy', 'Ellie', 'Ned', 'Cocker', 'Haroun Alraschid', 'Joey Bagstock', 'Mrs Perch', 'Mrs MacStinger', 'Stephen Blackpool', 'Young', 'Calomel', 'Orthography', 'Five', 'Sales', 'Hatchet', 'Nick', 'Kingston', 'Enough', 'Ville', 'Thus Florence', 'Beautiful', 'Jeffries', 'Fox', 'Perkins', 'Listen', 'Pankey', 'Miss Gradgrind', 'Philip', 'Mith Thquire', 'Mr. Jarvis Lorry', 'Streaker', 'Heavenly', 'Nay', 'Rain', 'Frenchman', 'Finchley', 'Doctor Parker Peps', 'Edit', 'Ivory', 'Tooley Street', 'Peruvian Miner', 'Fareway', 'MR. BOUNDERBY', 'Josiah Bounderby Esquire', 'Mrs Toots', 'Six', 'Guineas', 'Southcott', 'Someone', 'Barnet', 'Mr. Blackpool', 'Monsieur Lorry', 'Ginger', 'Simon', 'Shadows Mr', 'Neptune', 'Berry Mrs Wickam', 'Houses', 'Beef Shop', 'Coffee House', 'Twinkle', 'Hush', 'Engine Fireman', 'Skettles', 'Walters', 'Everybody', 'Chrisen', 'Baggs', 'Escape', 'Happily', 'Curse', 'Spirit', 'True', 'Haply', 'La Prairie', 'Lady Cankaby', 'Toodle Junior', 'Blue Beard', 'Jim', 'Rome', 'Infidel', 'Slightly', 'Little Hatchet', 'Anyone', 'Sir Barnet', 'Crafty', 'Wickam', 'Dolt', 'Normandy', 'John Bull', 'Cornelia Blimber', 'Whether Mrs MacStinger', 'Left', 'Hamlet', 'Pretty', 'Many', 'Charles', 'Curtius', 'Miss Blimber', 'Moosulmaun', 'Dust', 'Captain Bunsby', 'Home', 'Father', 'Citizen Manette', 'Lottery', 'Elizabeth', 'Lilburn W. Baggs', 'Sole', 'Thquire', 'Hem', 'Death', 'Magog', 'Alfred', 'Plautus', 'Hide Charles', 'Hock', 'Hawkyard', 'Dress', 'Madame', 'Mine', 'Joe', 'Cornelia', 'Cupids', 'Darkest', 'Seraglio', 'Somebody', 'Grief', 'Spies', 'Caius Marius', 'Citizen', 'Philadelphia', 'Art', 'Vengeance', 'Simply', 'Skewton', 'Mum', 'Debauchery', 'Often', 'Mr. Joseph Smith', 'Society', 'Drownded', 'Said Mr. Granville', 'MR. ARPIN', 'English', 'Sissy Jupe', 'Sir Parker Peps', 'Haggard Saint Antoine', 'Little Gunpowder', 'Dumb', 'Awast', 'Mr. Jeremiah Cruncher', 'Fourthy', 'La Force', 'Young Gay', 'Charles Dickens', 'Uncle Sol Captain Cuttle', 'Mr. Arpin', 'Chesapeake Bay', 'Furnished', 'Tetterby', 'Scholar', 'Tony Lumpkin', 'Rage', 'Meaner', 'Dick Jones', 'Silent', 'Mr. Mitchell', 'Law', 'See', 'Floy', 'Brown', 'Table Rock', 'Mr. Bridgman', 'George', 'Blimber', 'Habit', 'Cupid', 'Future', 'Take', "Mr. M'Choakumchild", 'Victim', 'Edward', 'Steamingine', 'Adam Smith', 'Heaven', 'Inquest', 'Jerusalem Buildings', 'Hark', 'Giles', 'Jest Book', 'Polly Toodle', 'Lady Jane Grey', 'Baxter', 'Belinda Bates', 'Ovid', 'Mrs Impudence', 'Caliph Haroun Alraschid', 'Cincinnati', 'Uncle Sol', 'Anthony', 'Aunt', 'Soho', 'Joseph B', 'Jupe', 'Oliver', 'Mr. Sully', 'Sooth', 'Cousin', 'Mildew', 'Knife', 'Partially', 'Anne', 'Public Life', 'Green', 'Certainly', 'Mr. Thomas', 'Grosvenor Square', 'Jollson', 'Wind', 'Chops', 'Magna Charta', 'Cove', 'Widge', 'Enterprise', 'Pray', 'Homeless', 'Quick', 'Love', 'STEPHEN', 'Swollen', 'Barsad', 'Mrs Toodle', 'Mr. Taylor', 'Whereupon Florence', 'Sparsit', 'Doctor', 'Sleepy Hollow', 'Eve', 'Doctors', 'Marquis', 'Thank God', 'Mean', 'Haroun', 'Bishop Butler', 'Gradgrind', 'Roger Cly', 'Dances', 'Lighter', 'Mary Daws', 'Rob', 'Arcade', 'Milly', 'Likewise', 'Jove', 'Pod', 'Judas', 'Angels', 'Levi', 'Edith Dombey', 'Alone', 'Michaelmas', 'Mister Toots', 'Doctor Manette', 'Happy New Year', 'De Lampert', 'Jem', 'Barker', 'Charles Darnay', 'Little', 'Lady', 'Josiah', 'Millennium', 'Brook Street', 'Pilkins', 'Marwood', 'Thus Miss Pross', 'Dishonest Faction', 'Anywheres', 'Sir Gaston Fareway', 'Solemn', 'Little Paul Dombey', 'Master Briggs', 'Gills', 'Careless', 'Ratcliffe', 'Railroad', 'Pardon', 'Forward', 'Arms', 'Manette', 'Turk', 'Timely Provision', 'Adams', 'Scanty', 'Oho', 'Twelfth Night', 'Mr. Bounderby', 'Granger', 'Castle', 'Hail Columbia', 'Dear Di', 'Angry', 'Divine Sermon', 'Gentleman', 'Swiftly', 'Loo Bounderby', 'Citizen Defarge', 'Married Females', 'Good', 'Juliana MacStinger', 'Holly', 'Shadowy', 'Husband', 'Giles Scroggins', 'Moloch', 'Walk', 'Fire', 'Frederick', 'Gog', 'Poor Walter', 'Hall', 'Trottle', 'Blackpool', 'Seven', 'Belinda', 'Omega', 'Bush', 'Charleston'}

9.2. Most popular name:
('Dombey', 1780)

Vector space based Information Retrieval System for Text Processing - Information retrieval

Information Retrieval: Text Processing Group 13 Sequence of operations Install Requirements Add given wikipedia files to the corpus directory. Downloa

1 Jan 01, 2022
Python flexible slugify function

awesome-slugify Python flexible slugify function PyPi: https://pypi.python.org/pypi/awesome-slugify Github: https://github.com/dimka665/awesome-slugif

Dmitry Voronin 471 Dec 20, 2022
Repositori untuk belajar pemrograman Python dalam bahasa Indonesia

Python Repositori ini berisi kumpulan dari berbagai macam contoh struktur data, algoritma dan komputasi matematika yang diimplementasikan dengan mengg

Bellshade 111 Dec 19, 2022
A slugifier that works in unicode

Unicode Slugify Unicode Slugify is a slugifier that generates unicode slugs. It was originally used in the Firefox Add-ons web site to generate slugs

Mozilla 315 Nov 21, 2022
从flomo导出的笔记中生成词云

flomo-word-cloud 从flomo导出的笔记中生成词云 如何使用? 将本项目克隆到你的电脑上,使用如下的命令,安装所需python库 pip install -r requirements.txt 在项目里新建一个file文件夹,把所有从flomo导出的html文件放入其中 运行main

Hannnk 9 Dec 30, 2022
A Python library that provides an easy way to identify devices like mobile phones, tablets and their capabilities by parsing (browser) user agent strings.

Python User Agents user_agents is a Python library that provides an easy way to identify/detect devices like mobile phones, tablets and their capabili

Selwin Ong 1.3k Dec 22, 2022
This is an AI that is supposed to say you if your text is formal or not

This is an AI that is supposed to say you if your text is formal or not. It's written in Python 3 and has some german examples (because I'm german yk) in the text.json file. This file contains the te

1 Jan 12, 2022
This project aims to test check if your RegExp are being matched by grep.

Bash RegExp This project aims to test check if your RegExp are being matched by grep. It's a local server that starts on the port 8080. It runs the se

Quatrecentquatre 1 Feb 28, 2022
Auto translate Localizable.strings for multiple languages in Xcode

auto_localize Auto translate Localizable.strings for multiple languages in Xcode Usage put your origin Localizable.strings file in folder pip3 install

Wesley Zhang 13 Nov 22, 2022
text-to-speach bot - You really do NOT have time for read a newsletter? Now you can listen to it

NewsletterReader You really do NOT have time for read a newsletter? Now you can listen to it The Newsletter of Filipe Deschamps is a great place to re

ItanuRomero 8 Sep 18, 2021
一款高性能敏感词(非法词/脏字)检测过滤组件,附带繁体简体互换,支持全角半角互换,汉字转拼音,模糊搜索等功能。

一款高性能非法词(敏感词)检测组件,附带繁体简体互换,支持全角半角互换,获取拼音首字母,获取拼音字母,拼音模糊搜索等功能。

ToolGood 3.6k Jan 07, 2023
A Python app which can convert normal text to Handwritten text.

Text to HandWritten Text ✍️ Converter Watch Tutorial for this project Usage:- Clone my repository. Open CMD in working directory. Run following comman

Kushal Bhavsar 5 Dec 11, 2022
Convert English text to IPA using the toPhonetic

Installation: Windows python -m pip install text2ipa macOS sudo pip3 install text2ipa Linux pip install text2ipa Features Convert English text to I

Joseph Quang 3 Jun 14, 2022
Bidirectionally transformed strings

bistring The bistring library provides non-destructive versions of common string processing operations like normalization, case folding, and find/repl

Microsoft 352 Dec 19, 2022
The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity

Contents Maintainer wanted Introduction Installation Documentation License History Source code Authors Maintainer wanted I am looking for a new mainta

Antti Haapala 1.2k Dec 16, 2022
Phone Number formatting for PlaySMS Platform - BulkSMS Platform

BulkSMS-Number-Formatting Phone Number formatting for PlaySMS Platform - BulkSMS Platform. Phone Number Formatting for PlaySMS Phonebook Service This

Edwin Senunyeme 1 Nov 08, 2021
Shows twitch pay for any streamer from Twitch leaked CSV files.

twitch_leak_csv_reader Shows twitch pay for any streamer from Twitch leaked CSV files. Requirements: You need python3 (you can install python 3 from o

5 Nov 11, 2022
AnnIE - Annotation Platform, tool for open information extraction annotations using text files.

AnnIE - Annotation Platform, tool for open information extraction annotations using text files.

Niklas 29 Dec 20, 2022
汉字转拼音(pypinyin)

汉字拼音转换工具(Python 版) 将汉字转为拼音。可以用于汉字注音、排序、检索(Russian translation) 。 基于 hotoo/pinyin 开发。 Documentation: http://pypinyin.rtfd.io/ GitHub: https://github.co

Huang Huang 4.2k Jan 03, 2023