The Lost Language Recovery Trick - counting an undeciphered script

Science and Technology

Got an undeciphered writing system you can't read? No problem! Here's how to start cracking a lost script: use your digits.
I figured it was time to reconvene the Decipherment Club for this one. As Robinson points out in the introduction to Lost Languages, counting the number of distinct glyphs in an undeciphered text is a convenient first step when you start deciphering a writing system. Linguists and decipherers have been using it since at least the days of the Assyriologist A. H. Sayce.
In principle, it's a simple little algorithm that we can use to make an educated guess about the type of writing system used in a lost text. You can get a feel for this simplicity in my short code demo at the end.
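As a rough illustration of the counting trick, here is a minimal Python sketch. It is not the script from the video: the function name `guess_script_type` and the cutoffs (40 and 100) are assumptions chosen for illustration, and real scripts can fall outside them.

```python
def guess_script_type(glyphs):
    """Guess a script type from the number of distinct glyphs in a text.

    `glyphs` is any iterable of already-segmented symbols.
    The cutoffs below are illustrative assumptions, not fixed boundaries.
    """
    distinct = len(set(glyphs))
    if distinct <= 40:
        return distinct, "alphabet-like"
    if distinct <= 100:
        return distinct, "syllabary-like"
    return distinct, "logographic or logophonetic"


# Treat each non-space character of an English pangram as one glyph.
sample = [g for g in "the quick brown fox jumps over the lazy dog" if not g.isspace()]
print(guess_script_type(sample))
```

The count only means something once the text has been segmented into glyphs and variants of the same glyph have been grouped, and longer texts give a more trustworthy count.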
Credits for images and sound effects also used in Thoth's Pill - see link in that description box:
• Thoth's Pill - an Anim...
CC-BY and public domain images used in this video but not in Thoth's Pill:
Rongorongo, msdstefan
Rongorongo G verso, Chauvet
Klosterkällaren, Albabos
Léon Cogniet's portrait of Champollion
Sequoyah, Bird King and Inman
Hittite Seal of Tarkummuwa, Walters Art Museum
Phaistos Disc, C Messier
Archibald Sayce, George McCready Price
Rongorongo with index numbers:
Michael Everson, "Draft Unicode proposal for Rongorongo"
Music by Kevin MacLeod (incompetech.com):
Cambodian Odyssey, Vadodara Chill Mix
Music by Josh from NativLang (soundcloud.com/Botmasher):
Oowah

Comments: 159

  • @GexGenesis
    @GexGenesis 6 years ago

    Please continue this series for us linguaphiles :(

  • @xarjef8509
    @xarjef8509 6 years ago

    Dude, please finish this serie

  • @beeble2003

    @beeble2003

    3 years ago

    By the way, in English, "series" is both singular and plural -- it's not the plural of "serie", which isn't a word.

  • @ellies_silly_zoo

    @ellies_silly_zoo

    3 years ago

    I think it was either just a simple misspelling or a joke indicating that the series hasn't been finishe

  • @beeble2003

    @beeble2003

    3 years ago

    @@ellies_silly_zoo No, it's almost certainly just a mistake. In many Romance languages (I was aware of French and Italian, but Spanish and Portuguese also do this), the singular noun is "serie". That, coupled with the fact that "series" looks like a plural in English, leads many speakers of Romance languages to assume that "serie" is the English singular, too. I wouldn't have even mentioned it if it weren't such a highly characteristic error of Romance speakers using English as a second language -- even people whose English is near-perfect very often make this mistake.

  • @ellies_silly_zoo

    @ellies_silly_zoo

    3 years ago

    @@beeble2003 Yo I'm not arguing about something this unimportant

  • @cactussenpai9625

    @cactussenpai9625

    3 years ago

    @@ellies_silly_zoo lmao. Interesting mistake, though not worth arguing abou

  • @beeble2003
    @beeble2003 7 years ago

    In cryptography, the idea of counting the symbols is known as "frequency analysis". I don't know what linguists call it but it would make sense to use the same term.

  • @inkyscrolls5193

    @inkyscrolls5193

    7 years ago

    As a linguist I can confirm that yes, we use the same term.

  • @Caesim9

    @Caesim9

    7 years ago

    In frequency analysis you also consider which symbols or groups of symbols occur most often. For example, the word "the" appears very often in English texts.

  • @Bentleytalksaboutstuff

    @Bentleytalksaboutstuff

    2 months ago

    @@Caesim9 Yes, this can work with words and symbols.
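The frequency analysis discussed in this thread can be sketched in a few lines of Python. This is only an illustration: the helper name `symbol_frequencies` is made up here, and it assumes the text has already been split into symbols (or words).

```python
from collections import Counter


def symbol_frequencies(symbols, n=1):
    """Count every run of n consecutive symbols (n=1 counts single symbols)."""
    symbols = list(symbols)
    runs = (tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))
    return Counter(runs)


# Works the same way on words as on individual symbols.
words = "the cat saw the dog and the bird".split()
print(symbol_frequencies(words).most_common(1))

# Pairs of adjacent symbols (n=2) hint at common combinations.
print(symbol_frequencies("abab", n=2).most_common(1))
```

Counting distinct symbols, as in the video, only uses the number of keys in such a table; frequency analysis also looks at how large each count is.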

  • @JudahCaruso
    @JudahCaruso 8 years ago

    One of the more underrated channels on here. Love the content guys!

  • @NativLang

    @NativLang

    8 years ago

    Working hard not to be the underdog anymore :D Thank you!!

  • @agustin.santiago.gutierrez
    @agustin.santiago.gutierrez 3 years ago

    How come this series hasn't been continued? These 4 introductory videos are just pure gold!!!! We want more of this!

  • @PluTiD
    @PluTiD 8 years ago

    This is incredible! I've always wondered how lost scripts are deciphered! Love your channel and these videos! You're currently my main source of procrastination at work!

  • @NativLang

    @NativLang

    8 years ago

    +PluT0iD 733 I strive to make these as distracting as possible! :D

  • @j.t.hartzfeld1368
    @j.t.hartzfeld1368 7 years ago

    Python is probably the coolest language you've showcased on this channel. :-P

  • @NativLang

    @NativLang

    7 years ago

    Especially when it eats the other languages and spits out numbers!

  • @j.t.hartzfeld1368

    @j.t.hartzfeld1368

    7 years ago

    +

  • @beeble2003

    @beeble2003

    3 years ago

    Eh. Having the exact amount of whitespace between two "words" be significant is a really bad idea, IMO. And not having to declare variables is a great way to introduce bugs into programs.

  • @LoLrand0mness
    @LoLrand0mness 8 years ago

    yo, let your script run through your script. wanna know what it would say about itself :D

  • @NativLang

    @NativLang

    8 years ago

    Very meta!

  • @freyja5800

    @freyja5800

    7 years ago

    If the comments with the hiragana, kanji, etc. are counted as part of the script: logographic/logophonetic. Otherwise: alphabetic.

  • @weirdflexbutok9743
    @weirdflexbutok9743 6 years ago

    Can't wait for part 5!!!

  • @rohanpandey2037
    @rohanpandey2037 8 years ago

    Finally! Thank you for restarting your regular videos, I'm glad that comp-chomp thing is over. The first few videos of comp-chomp were interesting, but after a while it got pretty repetitive and boring

  • @NativLang

    @NativLang

    8 years ago

    +Rohan Pandey Thanks for the feedback! It was fun (and exhausting!) to have YT push us to try something new, but I'm very comfortable getting back to normal. :D

  • @ChristianJiang
    @ChristianJiang 8 years ago

    Very interesting! What a big coincidence, I wondered about this yesterday!

  • @NativLang

    @NativLang

    8 years ago

    +Christian Jiang Whew, just in time! I'd hate to leave you wondering.

  • @danielburke9536
    @danielburke9536 3 years ago

    Please continue this series 🙏

  • @basilforth
    @basilforth 7 years ago

    Enjoyed these videos! Looking forward to more in the future!

  • @Reubentheimitator6572
    @Reubentheimitator6572 6 years ago

    Please make another video in this series

  • @sion8
    @sion8 8 years ago

    When I first heard of Rongo-Rongo, I was sad to find out it has yet to be unlocked. As far as I know, linguists think it is more than likely writing rather than proto-writing, though it could have some mnemonic features. I'm not even sure whether they have compared it with the Rapa Nui language or, for that matter, any other Polynesian language. I hope one day it is unlocked so that the Rapa Nui can gain back more of their native culture. The same goes for the various Andean peoples and the quipu knot system, which could be a way of storing more than just numbers, maybe even some small words, as I've read some archaeologists think.

  • @hectordanielsanchezcobo7713

    @hectordanielsanchezcobo7713

    2 years ago

    for some reason when I first read of it I got scared

  • @sion8

    @sion8

    2 years ago

    @@hectordanielsanchezcobo7713 Scared? Why?

  • @hectordanielsanchezcobo7713

    @hectordanielsanchezcobo7713

    2 years ago

    @@sion8 idk, maybe the unknown scared me back then

  • @sion8

    @sion8

    2 years ago

    @@hectordanielsanchezcobo7713 It's just writing, just like Egyptian hieroglyphs or Sumerian Cuneiform.

  • @hectordanielsanchezcobo7713

    @hectordanielsanchezcobo7713

    2 years ago

    @@sion8 yeah idk

  • @yoavshati
    @yoavshati 7 years ago

    I would guess that each symbol (3:55) is a word; it has too much detail to write long texts with each one being a single letter or a syllable.

  • @Hal2718

    @Hal2718

    7 years ago

    Yoav Shati The same thing was once thought about Mayan writing, but that turned out to not be entirely correct.

  • @nicolekortstam

    @nicolekortstam

    6 years ago

    Maybe some symbols are diphthongs or triphthongs. What do you think about that?

  • @RandidTheBandit
    @RandidTheBandit 1 year ago

    please continue the series !!

  • @nicolekortstam
    @nicolekortstam 6 years ago

    I just love this channel.

  • @yashyadav-ik2en
    @yashyadav-ik2en 4 years ago

    I really, really loved these. Please focus on finishing this series first, guys.

  • @kpaukeaho6180
    @kpaukeaho6180 8 years ago

    Mahalo nui keia! As a linguistics enthusiast currently learning Hawaiian, I really enjoy this channel!

  • @NativLang

    @NativLang

    8 years ago

    +Mark Stoleson 'A'ole pilikia! I enjoyed reading your thoughts on Tahitian and Hawaiian taboo systems and look forward to having you watch in the future. A hui hou.

  • @ferretyluv
    @ferretyluv 7 years ago

    Please do more on this! I keep seeing all these videos that say "come watch this next video!" And I can't find it. Please complete your series!

  • @scarlettestanley3391
    @scarlettestanley3391 3 years ago

    Yes! Please do come back and tell us all you know and understand!

  • @CapnButtrflyBritches
    @CapnButtrflyBritches 7 years ago

    I love this channel!

  • @legaltenderradfem
    @legaltenderradfem 3 years ago

    Amazing channel, thank you !

  • @adrianlopez5019
    @adrianlopez5019 2 years ago

    EXCELLENT...!!!! ... fascinating Rongo-Rongo....

  • @malori9293
    @malori9293 8 years ago

    I adore this!!

  • @NativLang

    @NativLang

    8 years ago

    Thank you! I have another video coming out on Friday, too!

  • @touisbetterthanpi
    @touisbetterthanpi 8 years ago

    These are wonderful

  • @johannes-euquerofalaralema4374
    @johannes-euquerofalaralema4374 5 years ago

    Well done!

  • @AdarableKitten
    @AdarableKitten 7 years ago

    Will there be more of these? I'm so intrigued!! I want to know what the creepy beast-looking glyphs mean.

  • @MalaysianTropikfusion
    @MalaysianTropikfusion 8 years ago

    I love your videos so much!

  • @NativLang

    @NativLang

    8 years ago

    +Zulhilmi Ghouse (Bubbles) Thanks for being a regular commenter! :D

  • @kaisaheikkila
    @kaisaheikkila 6 years ago

    How about hieroglyphics and similar writing systems? It is basically an abjad but with a vast amount of determinatives. Is there any possibility of separating the letters from the determinatives based on the frequency they appear or places they appear in a text? Or is the near impossibility of this exactly the reason why it took the Rosetta stone (and her little brother from Philae) to decipher hieroglyphics?

  • @eduardtronciu9786
    @eduardtronciu9786 6 years ago

    Please do the continuation of the Decipherment Club series!

  • @craigcollings5568
    @craigcollings5568 7 years ago

    encouragement!!!!!

  • @dlwatib
    @dlwatib 6 years ago

    You might also have to look at line endings to determine which direction the writing is in, or even to determine whether or not the direction is consistently left or right.

  • @NinuRenee
    @NinuRenee 8 years ago

    I really like your videos, very fun to watch! Could i suggest making a short video on comparing the brahmic scripts, there are so many of them and they all look the same, for me it's near impossible to recognize them when I encounter one

  • @NativLang

    @NativLang

    8 years ago

    +uneliasmarsu Thank you!! Have you seen the Thoth's Pill "alphasyllabaries" episode? Creating that, I thought of getting more into the differences between scripts. It got messy fast. Such an intriguing family of scripts. Thanks for the suggestion!

  • @avii2807
    @avii2807 8 years ago

    Very interesting! A few of your Thoth's pill and language videos have actually inspired me to create a system of my own. It's simply a cross between an alphasyllabary and an alphabet. You write the consonant (21) and the vowel mark (6) above or below but if the consonant or the vowel is repeated, you write the base form of the consonant and vowel! Your videos have inspired me to do this much and I can't wait to see even more! Verelle sen! (Thank you!)

  • @NativLang

    @NativLang

    8 years ago

    Thanks for sharing a bit about your creation! What are your letters styled like - any particular script? Are they something entirely unique? So happy you like Thoth's Pill and hope to keep you watching :D

  • @avii2807

    @avii2807

    8 years ago

    +NativLang I have styled them according to their IPA description. For example, the letter "m" is a voiced nasal plosive, so you write a loop before writing a curve. The entire writing system is designed to show how each phoneme is pronounced. I could send you my written version. As for the system, it functions similarly to the Arabic abjad: there are isolated, final, medial and initial forms. However, vowels are written above the consonant if they are front and below it if they are back. But the vowels themselves also have individual symbols, so my writing system functions as an alphasyllabary (abugida), an alphabet and somewhat like an abjad! A three-ish way system!

  • @avii2807

    @avii2807

    8 years ago

    +NativLang I absolutely love your videos, as they inspired me to write my own language called "Sorrelic".

  • @NativLang

    @NativLang

    8 years ago

    Smart! Sounds more linguistic than any of the writing systems I've created. What kind of language is Sorrelic?

  • @avii2807

    @avii2807

    8 years ago

    +NativLang It's a language my great grandfather invented and passed down. I am a fourth-generation Sorrelic speaker. We used only the Roman alphabet, so I, being the youngest among us Sorrelic speakers, was tasked with creating the system. Here are a few examples.
    Hello - Adra
    Bye - Avte
    Goodbye - Avtetakt (farewell)
    And a few long words:
    Airplane - Aerantrasinette (Aerial transport)
    Democracy - Takamrivaleaosnasit (Controlled by the people)
    We have only 5000-7500 true pure words. Any other words besides those are defined by prefixes and suffixes, but we have some loan words.
    Government - Gaverenementi
    Technology - Tekkonolohiaga

  • @Chrischi3TutorialLPs
    @Chrischi3TutorialLPs 3 years ago

    In theory, you can use frequency analysis to roughly place a language inside a family. Obviously, closely related languages would, for the most part, have very similar frequencies for the majority of letters, save for the ones that changed when pronunciation shifted, whereas languages that are more distantly related would disagree more and more as they move apart. You could then use the resulting estimates to sketch a family tree. Thinking about it, biologists actually did that with retroviral DNA. For those who don't know, a retrovirus reproduces by copying its own DNA into that of a host, thus making that host produce copies of it. Those insertions are clearly visible in our DNA, and under specific circumstances that DNA can be passed down, in which case you can use the mutations occurring inside those pieces of DNA to build a phylogeny of closely related species, since, when the species diversify, the mutations present in that DNA also diverge. (You can also estimate when this divergence happened by counting the number of mutations present, btw.) The result is a family tree. This was actually done on the great apes. The results were pretty much what we expected.

  • @aerynrowe5574
    @aerynrowe5574 16 days ago

    Would still like more!

  • @frankharr9466
    @frankharr9466 7 years ago

    You know, the Latin and Cyrillic scripts are pretty unique for having so many phonetically identical versions of the same graphemes. I mean, yes, in Arabic, you can have up to four versions of each letter, but that's based on the technique of writing it and that's a technique not being used on this disk. I'm just noting that we may have a better idea of how many symbols there are in this text than we think we do. Oh, and YAY! The program works!

  • @danielpealer3561
    @danielpealer3561 7 years ago

    As I listened to this, it occurred to me that if a script falls into the category of an alphabet, some of its characters may be not-entirely-obvious word dividers. For example, English and most other modern languages currently use blank spaces to divide words, but runic scripts used dots as word separators. Is there a way (perhaps linked to the Zipf distribution of the script and of similar scripts) to estimate the average character length of words? If we can do this, we may be able to figure out which characters are candidates for word dividers. (I know the terms I use aren't quite right, but I'm a programmer, so I think of characters, not graphemes.)
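One naive way to explore this question in Python is sketched below. It is a hypothetical heuristic, not an established decipherment method: it assumes a word divider is frequent and never occurs twice in a row, and the function names `divider_candidates` and `mean_word_length` are invented for this example.

```python
from collections import Counter


def divider_candidates(symbols, top=3):
    """Rank symbols that are frequent and never occur twice in a row."""
    symbols = list(symbols)
    doubled = {a for a, b in zip(symbols, symbols[1:]) if a == b}
    counts = Counter(s for s in symbols if s not in doubled)
    return [s for s, _ in counts.most_common(top)]


def mean_word_length(symbols, divider):
    """Average length of the runs between occurrences of a proposed divider."""
    runs, current = [], 0
    for s in symbols:
        if s == divider:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return sum(runs) / len(runs) if runs else 0.0


text = "run.stone.rune.carved.here"
for candidate in divider_candidates(text):
    print(candidate, mean_word_length(text, candidate))
```

Frequent letters show up as candidates too, so the output is a shortlist to inspect rather than an answer. A candidate that yields a plausible average word length is worth a closer look.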

  • @kennypeere9146
    @kennypeere9146 2 years ago

    Hey, you got me thinking. Rongorongo is turned 180° with every new line. What if they were questions and answers or simple dialogues? As if both writers would be sitting across the tablet. Do you know if this theory has been proposed or researched yet?

  • @hglundahl
    @hglundahl 6 years ago

    It is certain that the English tradition says Tolkien invented The Hobbit and Lord of the Rings as leisure reading. How likely would the alternative be: him finding a book in Adunaic and in tengwar and deciphering all that?

  • @RafaelRabinovich
    @RafaelRabinovich 7 years ago

    Could you use your script with the Voynich manuscript?

  • @Salsmachev
    @Salsmachev 6 years ago

    I notice this video is in a playlist marked as incomplete. Any plans to come back to this project?

  • @osmanika8741
    @osmanika8741 3 years ago

    "With a smile on our face" Shows character with no mouth

  • @EpicFishStudio
    @EpicFishStudio 7 years ago

    Here is a super-minimalistic version of your script. It asks for the input file location and draws your conclusion:
    with open(input("file location")) as file:
        c = len(set(file.read()))
    if c

  • @screamtoasigh9984
    @screamtoasigh9984 6 years ago

    How would your script know it's an alphabet if the letters have marks around them, like the diacritics (nikudot) that Hebrew has and sometimes uses? Say you have a collection of texts and didn't know.
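For text that is already encoded in Unicode, one possible approach is to count base letters separately from combining marks. The Python sketch below assumes the marks are Unicode combining characters (as Hebrew vowel points are); `base_symbols` is a hypothetical helper, and for a truly undeciphered script deciding what counts as a "mark" is itself part of the analysis.

```python
import unicodedata


def base_symbols(text):
    """Return the set of non-space symbols after dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return {
        ch for ch in decomposed
        if not ch.isspace() and not unicodedata.combining(ch)
    }


# A pointed Hebrew word written with escapes: shin + qamats + shin dot,
# lamed, vav + holam, final mem.
pointed = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"
print(len(set(pointed)), len(base_symbols(pointed)))
```

Comparing the two counts shows how much of the inventory comes from marks rather than letters.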

  • @hglundahl
    @hglundahl 6 years ago

    1:05 Leaving out tengwar here, what do you think of recurring 32 symbols, which Genevieve von Petzinger is investigating? I think I mentioned them before, and before viewing this video I thought "32 symbols? could be alphabetic" If you think any "text" she found is too short, how about alphabetic used as mnemotechnics? Adam, Seth, Enos, Cainan, Malaleel, Jared, Henoch, Mathusala, Lamech, Noah, abbreviated as Aleph, Shin, Aleph, Kaph, Mem, Iod, He (one of them!), Mem, Lam, Nun Or more likely sth like (Noah), Japheth, Gomer, Ascenez abbreviated as (Nun), Iod, Gimel, Aleph or (Noah), Japheth, Javan, Tharsis abbreviated as (Nun), Iod, Iod, Thet (or Tau?) etc. It could be from when Noah was predividing Earth between the peoples, and he could have said division should become in force or law at the birth of Phaleg / Peleg. Which, being flouted, was instead followed by a Babel project leading up to another division. But the Babel project leads us already into Neolithic, since now known as Göbekli Tepe.

  • @lucillefrancois150
    @lucillefrancois150 6 years ago

    Finish this series!

  • @cassiekoenigshofer9321

    @cassiekoenigshofer9321

    6 years ago

    Lucile Francois. U are nothing.

  • @anwardiggs8748

    @anwardiggs8748

    3 years ago

    @@cassiekoenigshofer9321 no u

  • @ErikNilsen1337
    @ErikNilsen1337 6 years ago

    Any plans to finish this series sometime?

  • @jacobtracy7847
    @jacobtracy7847 2 years ago

    Could you do one on what we know about linear A?

  • @ladaylyn
    @ladaylyn 6 years ago

    I am full of questions. So they count and identify the characters, but then how do they know what it says? If it is an educated guess, do they just guess at the story the scripts are telling? Who decides whether the decipherment is accurate? I am left with more questions! :)

  • @harrisonconcord6562
    @harrisonconcord6562 8 years ago

    what would the next step be in the process of deciphering the script?

  • @Emily-fs7vd
    @Emily-fs7vd 7 years ago

    Can you do another Decipherment Club video? Some friends and I are working on creating a language, and I'm leading the effort, so I'm trying to decide which form of writing will be most convenient for us.

  • @theperpetualprocrastinator9776

    @theperpetualprocrastinator9776

    6 years ago

    Emily Cairns It really depends on your grammar. If there's little to no inflection, then a logographic script like Chinese would be an option. If you have a simple syllable structure and only a few syllables, then a syllabary like katakana would work. On the other hand, if all else fails, go with an alphabet.

  • @TheDarkWiiPlayer
    @TheDarkWiiPlayer 7 years ago

    For linguistics stuff I recommend coding in lua; it deals very well with unicode strings and has a lot of useful functions that deal with strings.

  • @TheDarkWiiPlayer

    @TheDarkWiiPlayer

    7 years ago

    So I just tried this out of curiosity, and I was able to code something similar to what you have in 11 lines of Lua code:
    local function f(str, cuts, types)
      local chars = {}
      for char in str:gmatch"[^%s]" do
        table.insert(chars, char)
      end
      for i = #cuts, 1, -1 do
        if #chars > cuts[i] then return types[i] end
      end
      return types[0]
    end
    print(f("this is a test string", {5,10,20}, {"5 or more", "10 or more", "20 or more", [0]="less than 5"}))

  • @EpicFishStudio

    @EpicFishStudio

    7 years ago

    DarkWiiPlayer He did the code in Python. Here is my Pythonese one in just 4 lines:
    with open(input("file location")) as file:
        c = len(set(file.read()))
    if c

  • @sethlangston181
    @sethlangston181 8 years ago

    Thai is an anomaly in terms of alphabets. It has 76 letters, including 4 tone markers, which is more than the Japanese syllabary.

  • @NativLang

    @NativLang

    8 years ago

    +Seth Langston Could we have guessed it's an alphasyllabary without knowing? It does have similar-looking glyphs and special marks (in a way, like Japanese kana diacritics). For me, the Yi syllabary is a real outlier for this trick.

  • @spiralcraft8957
    @spiralcraft8957 8 years ago

    Haha, the second I saw the rongorongo I was all like, I know why no one has deciphered it yet... because every time someone does, their life ends up becoming a Call of Cthulhu movie and they are never heard from again, hehe. What do you render with, if I may ask? Loved the vid, btw.

  • @NativLang

    @NativLang

    8 years ago

    +Adriaan Ater Haha, is it that eerie? Almost always Blender.

  • @spiralcraft8957

    @spiralcraft8957

    8 years ago

    +NativLang Well, the more I looked at it the worse it became. Every "symbol" worries me, lol. It's like "they", the characters in powerful poses, created the language around their environment, and I know who does that, lol. That aside, every human seems to be either drowning, being eaten, mutilated, or tied up underwater with pipes from the surface keeping them alive (for snacks later?), at least on the tablet I am looking at. I was already afraid of the ocean, hehe. Weird to see what looks like a whale hooked up to the same pipe structure. The pipes don't seem to connect to the more fish-like humanoids. There's a lot of information embedded in them, much like in Sumerian, but more crude in the worst way. When I come across a tablet I have not seen, I sculpt it out in ZBrush, but not this one; I can smell the ocean from where I sit. Not that brave, lol. I was hoping to go to Easter Island one day, so I am glad I learned of the rongorongo before I went; that would have been bad. Jeez, I would have slowly backed away all the way to the boat, lol. As someone who loves to be creative, I can make up a few stories of what I think might be happening, but I did see what looks like the symbol for metal and for sending metal up, similar to old Sumerian token signs. Do you know if there are similar glyphs in Native American carvings? I feel like I have seen some before, haha, but maybe it was on the walls in the Call of Cthulhu game, lol. I use Blender too. Well done on the render!!

  • @NativLang

    @NativLang

    8 years ago

    Thank you! I don't know of any undeciphered Native American scripts outside of Mesoamerica. None of the petroglyphs remind me of these tablets. Making connections between Polynesia and the Americas gets into a whole thing academically, especially when speculating about cultural contact. There have been many attempts at cracking this script. I might tell its story sometime.

  • @tamil_npc
    @tamil_npc 3 years ago

    Would you be kind enough to share the python code you created for this example?

  • @dg-hughes
    @dg-hughes 7 years ago

    When I was a kid I created my own alphabet or whatever it was no not a code i.e. not letter substitution. I wonder what it would say about it but I forget where I put it :( I was getting pretty good at writing it fluently.

  • @olehagen3985
    @olehagen3985 2 years ago

    Your trick goes, that you let us take for granted, that hieroglyphs are inevitably language: not say a calendar.

  • @katelillo1932
    @katelillo1932 6 years ago

    I love all your topics, but I really hope you can finish this series someday 😊

  • @KyleMielke
    @KyleMielke 7 years ago

    I couldn't help but get intrigued... Run your script on some grade 2 braille. For that matter, what exactly DO you call grade 2 UEB? Pseudo-syllabic?

  • @parthiancapitalist2733
    @parthiancapitalist2733 6 years ago

    Longer is better Giggity

  • @ramadantuncberk2982
    @ramadantuncberk2982 6 years ago

    CAN ANY ONE HELP ME I NEED TO FIND A WORD IN HITTITE WRITING THE WORD IS RUNDAS

  • @Arkevorkhat
    @Arkevorkhat 6 years ago

    no X? i remember seeing betwixt at least once.

  • @blkgardner
    @blkgardner 8 years ago

    English has 26 letters, but has more than 26 symbols. The capital and lower-case letters would make 52, adding the digits 0-9 would be 62 symbols. Punctuation symbols would take that number over 70, depending on what is counted: something like "?" or "!" could reasonably be confused as a letter, while "." and "," are too insignificant to be letters in their own right (probably.) Also, a math text would have more symbols than a newspaper, including such things as "+" "%" and the like.

  • @NativLang

    @NativLang

    8 years ago

    True. It's critical to know what's being counted, and - for this algorithm - to pare it down to the linguistic info. Interestingly, mathematical notation may be "discovered" within an undeciphered text even before the rest is cracked, like Maya. Understanding which glyphs are variants and which are distinct graphemes definitely matters.

  • @blkgardner

    @blkgardner

    8 years ago

    That seems to be putting the cart before the horse. If you know the A=a or G=g, you probably already know that English has an alphabet as opposed to a syllabary..

  • @NativLang

    @NativLang

    8 years ago

    The cases I'm aware of build up slowly, with long lists of variants grouped by index number. You might hypothesize that A and a represent distinct syllables, while a colleague asserts that sequences like "Apple" show up after a dot, while "apple" appears in the middle of other letters, suggesting that the two symbols are just variants. Then you show instances of "Apple" not preceded by a dot. The colleague replies that his count would fit known bicameral alphabets so well, while yours would be a poor syllabary for writing a late Germanic language, which this is suspected to be.
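The effect of these counting choices is easy to see in a small Python sketch. This is only an illustration for known English text: `count_graphemes` is a made-up helper, and using `str.casefold` and `str.isalpha` stands in for the hypothesis that "A" and "a" are variants and that punctuation and digits are not letters. For an undeciphered script, those groupings are exactly what is under debate.

```python
def count_graphemes(text, fold_variants=True, letters_only=True):
    """Count distinct symbols under explicit hypotheses about what to count."""
    symbols = [ch for ch in text if not ch.isspace()]
    if letters_only:
        # Hypothesis: punctuation, digits and math signs are not graphemes.
        symbols = [ch for ch in symbols if ch.isalpha()]
    if fold_variants:
        # Hypothesis: upper- and lower-case forms are variants of one grapheme.
        symbols = [ch.casefold() for ch in symbols]
    return len(set(symbols))


text = "Apple pie? An apple a day: 2 + 2 = 4!"
print(count_graphemes(text, fold_variants=False, letters_only=False))  # every mark counted
print(count_graphemes(text, fold_variants=False))                      # letters, case kept apart
print(count_graphemes(text))                                           # letters, case merged
```

Each hypothesis changes the total, which is why the same text can look like a larger or smaller inventory depending on what is being counted.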

  • @DTux5249
    @DTux5249 6 years ago

    Y this abandoned it was amazing

  • @claessjlborglindhardt4681
    @claessjlborglindhardt4681 6 years ago

    Don't you have to check on something which does not fall into any category to assure it works if nothing is found??

  • @claessjlborglindhardt4681

    @claessjlborglindhardt4681

    6 years ago

    also do you have a github? profile

  • @recklessroges
    @recklessroges 7 years ago

    Could you github your code so that we can play with it?

  • @susbicious6757
    @susbicious6757 4 years ago

    wait where's the rest of the series? :(

  • @faithwright7958

    @faithwright7958

    4 years ago

    It was never finished.

  • @DanielFrostable
    @DanielFrostable 7 years ago

    i'm a linguist who also programs, it would be great to see more exploration of language through programming

  • @asimqadri2009
    @asimqadri2009 7 years ago

    Interesting and informative. At 2:28 it is said that alphabets have between 10 and 40 letters. Please correct this; they can go beyond 50. For instance, Sindhi, which is spoken in Sindh province, Pakistan, and in Kach, India, has 52 letters.

  • @sereysothe.a

    @sereysothe.a

    7 years ago

    Sindhi is written in a modified version of the Perso-Arabic script, so it's not an alphabet but rather an abjad.

  • @pierreabbat6157

    @pierreabbat6157

    7 years ago

    If you count letters as if Nagari were an alphabet (e.g. दण्ड has five letters, including two short a's), there are about 50 letters, the exact number depending on the language (some have retroflex l, some have long vocalic r, Sanskrit has vocalic l in forms of one word). An unusually large syllabary is General Chinese Xonn-dzih (=Hàn zì), which has 2082 characters, because it represents two words differently if they are different in some Chinese language.

  • @michaelwatson113
    @michaelwatson113 7 years ago

    Ok. So, how would this work with real Japanese, which uses 4 systems of writing? Think of it. Kanji can be used in multiple ways, that some kana do double, triple, even quadruple duty, or serve as auxiliaries to kanji, romaji which may or may not spell European words or japanized words from god knows what language.

  • @steve1978ger
    @steve1978ger 7 years ago

    UTF-8 symbols have variable byte length. I doubt the script will work correctly for alphabets like Cyrillic or Inuktitut.
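Whether this is a problem depends on whether the script counts bytes or decoded characters. The Python 3 sketch below shows the difference; it says nothing about how the video's script reads its input, and `count_distinct` is a hypothetical helper written for this comparison.

```python
def count_distinct(raw: bytes):
    """Return (distinct code points, distinct bytes) for UTF-8 encoded input."""
    text = raw.decode("utf-8")
    symbols = {ch for ch in text if not ch.isspace()}
    # Compare against raw bytes, ignoring ASCII whitespace bytes only.
    octets = set(raw) - set(b" \t\r\n")
    return len(symbols), len(octets)


# A six-letter Cyrillic word, written with escapes so the example
# does not depend on the encoding of this file.
sample = "\u043f\u0440\u0438\u0432\u0435\u0442".encode("utf-8")
print(len(sample), count_distinct(sample))
```

In Python 3, a file opened in text mode with `encoding="utf-8"` yields decoded characters, so counting a set of them counts code points rather than bytes. Combining marks and multi-code-point graphemes still need separate handling.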

  • @krysta-ajhaah-min-yah8368
    @krysta-ajhaah-min-yah8368 5 years ago

    Did you finish this?...

  • @anoushkachopra8656
    @anoushkachopra8656 5 years ago

    finish the series please

  • @rafikchbaklo
    @rafikchbaklo 5 years ago

    why that list in the video didn't have Arabic? :( it is actually very correctable with everything you're talking here. aleph = alpha = Al in the first of every english word, it is the beginning jst like alpha and the omega, alpha is father, omega is mother, as for in arabic Om = MO-ther and h-OM-e. u see? B in arabic is Ba', ب - باء - means the base of things, basics, basement, bar, become, boat, ball, back, bone, brain. in arabic also words that states (basics) starts with B. Bet means Home in arabic Noun in arabic , the letter N, spelled noun, also means noun in english, for the meaning of letter ن - نون in arabic resembles to depth of meaning behind something, jst as for Jonas story with the whale in the (deep) blue sea, the nun = the whale, jonas knew with nun. Meme in english is also meem in arabic, the letter M = م - ميم . also means the water of things, the meaning or interpretation of something, it is adjustable and flexible just like water. mimic, medic, mother, mom, matter, mind... u get it. Paradise / faradise / Fardaws in arabic - فردوس = heaven what's beautiful is that I still find correlation between nordic and norse religion names correlating with Arabic. and Islamic religion naming and words that hold parables or deep philosophies. and a lot more. thanks anyway and I hope you proceed this series.

  • @aurelianpopescu1151
    @aurelianpopescu1151 1 year ago

    the undeciphered script is called the indus valley script

  • @vanessac.175
    @vanessac.175 8 years ago

    Hey! Just sent you an email under the subject line ‘Optica Entertainment & Creative Nation Inquiry’ if that helps you find it. Hoping to hear back from you! :)

  • @NativLang

    @NativLang

    8 years ago

    +Vanessa C. Got it! Thanks for watching. YouTube's been a great outlet for me but barely a success. Moving away from the business side and scaling down to gain back some time. I appreciate your message though!

  • @autumnbates1967
    @autumnbates1967 a year ago

    Okay, so I'm gonna throw a hypothetical your way. Suppose you have temples and temples of script, but no knowledge of how to pronounce it, and there is no identifiable language it's even remotely related to. Completely new glyphs, and so on. There are some drawings or carvings telling a story, but it's not enough to decipher the whole rooms and rooms of text. (For the purposes of imagination, you can see carvings of a person of power on horseback, fighting off thousands of warriors. This might be a story about a war and a powerful king.) Anyway, is there any way that I could decipher a lost language like that with no correlations to any other known language? @NativLang

  • @adriansegura5229
    @adriansegura5229 4 years ago

    Finish the series

  • @WatermelonEnthusiast9
    @WatermelonEnthusiast9 3 years ago

    you need to make that code into a website

  • @shinespider
    @shinespider 7 years ago

    Wait, so... you said yourself that a complete stranger to English wouldn't have any way of knowing that uppercase and lowercase letters aren't different symbols. So, if you fed English as it's written into the algorithm, it would detect at least 52 symbols, and that's not even counting punctuation marks. Who's to say "?" isn't a letter?
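
    The effect described above is easy to see with a small Python sketch. It is only an illustration, not the video's demo: `count_glyphs` is a hypothetical helper, and its `fold_case` and `drop_punct` switches encode exactly the knowledge a decipherer would not have at the start.

```python
import unicodedata

def count_glyphs(text, fold_case=False, drop_punct=False):
    """Count distinct non-space symbols, optionally merging case and skipping punctuation."""
    if fold_case:
        text = text.casefold()
    glyphs = set()
    for ch in text:
        if ch.isspace():
            continue
        # Unicode general categories starting with "P" are punctuation.
        if drop_punct and unicodedata.category(ch).startswith("P"):
            continue
        glyphs.add(ch)
    return len(glyphs)

sentence = "The quick brown fox jumps over the lazy dog?"
print(count_glyphs(sentence))
print(count_glyphs(sentence, fold_case=True, drop_punct=True))
```

    Even the inflated raw count for English stays within a few dozen symbols, far from the hundreds or thousands of signs expected of a syllabary or a logographic script, so the rough guess about system type would likely survive.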

  • @muhammadafiqmdjumnan9959
    @muhammadafiqmdjumnan9959 5 years ago

    ꧋ꦠꦼꦫꦶꦩꦏꦱꦶꦃ꧉

  • @MultiSciGeek
    @MultiSciGeek 8 years ago

    Good series but really disappointing. It took 3 videos to arrive at almost nothing. You could have explained this much better, faster and included way more stuff to explain it in great detail. Also include more examples.

  • @benjamenYTDeadTheGamer
    @benjamenYTDeadTheGamer 7 years ago

    I swear to god anytime I hear anyone say "ideograph", I hear them say "idiotgraph".

  • @verdakorako4599
    @verdakorako4599 8 years ago

    Esperanto lives: kzread.info/head/PLSDbbqExWNDpEfeHjZ3tRzM5kKWB8vpaw

  • @alejandromatosanguis5267

    @alejandromatosanguis5267

    8 years ago

    ESPERANTO!

  • @stephaniehb8654
    @stephaniehb8654 2 years ago

    Please put Quantum Grammar into your code and let me know what it says!!! Please msg me back!!

  • @stevezes
    @stevezes 8 years ago

    text written with just hiragana hahaahah

  • @NativLang

    @NativLang

    8 years ago

    +Duck Good enough for Heian novels, good enough for me! Oh Japanese writing, why are you systems within systems?!?

  • @tuxcup
    @tuxcup 7 years ago

    Run that script with Vietnamese

  • @ferretyluv

    @ferretyluv

    7 years ago

    Vietnamese is an alphabet. They use the Roman letters.

  • @tuxcup

    @tuxcup

    7 years ago

    +ferretyluv I know, just wondering if that script will see the massive amount of diacritics and think it is a syllabary
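
    How much the diacritics inflate the count depends on how the text is encoded. A minimal Python sketch, assuming Unicode input and using the standard `unicodedata` module (`diacritic_counts` is a hypothetical helper, not the video's script): it compares precomposed letters, decomposed letters plus combining marks, and base letters alone.

```python
import unicodedata

def diacritic_counts(text):
    """Return distinct-symbol counts as (precomposed, decomposed, base letters only)."""
    composed = unicodedata.normalize("NFC", text)
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks have a non-zero canonical combining class.
    base_only = [ch for ch in decomposed if not unicodedata.combining(ch)]

    def distinct(chars):
        return len({ch for ch in chars if not ch.isspace()})

    return distinct(composed), distinct(decomposed), distinct(base_only)

print(diacritic_counts("Tiếng Việt"))
```

    Over a long Vietnamese text the precomposed count would climb well past the base alphabet as vowel and tone combinations pile up, while the decomposed count stays near the base letters plus a handful of marks.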

  • @HappyHusbandnWife

    @HappyHusbandnWife

    7 years ago

    I know Vietnamese; it used to be different from modern Vietnamese, though

  • @gerardvanwilgen9917
    @gerardvanwilgen9917 3 years ago

    But using a computer programme is cheating a bit, because knowledge about the characters is already encoded in the computer language.

  • @kiritawhai7488
    @kiritawhai7488 3 years ago

    They need to get actual Polynesians to decipher this 😒 Too many biased views from other cultures; maybe try getting the people who made them to read them out!? There's an account of a Tahitian man reading these, but apparently it's not valid because his speech didn't add up to the glyphs as they "thought".

  • @Hecatonicosachoron
    @Hecatonicosachoron 7 years ago

    I suppose the next step is to try and identify groups of glyphs that are recurring and then find hints on grammatical structure, word separation etc. If educated guesses can be made about the language, and sufficiently many sentence examples (in that language) and inscriptions (in the undeciphered script) are available, then there are brute force methods of looking at glyph histograms and trying to match them up in ways consistent with predicted glyph frequencies... etc... etc...
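
    The first step suggested above, finding recurring groups of glyphs, can be sketched in Python with an n-gram count. This assumes the inscription has already been transcribed into a list of glyph IDs; the IDs below are made up, and `recurring_groups` is a hypothetical helper rather than an established decipherment tool.

```python
from collections import Counter

def recurring_groups(glyphs, n=2, min_count=2):
    """Return n-grams of glyph IDs seen at least min_count times, most frequent first."""
    # Zipping the list against shifted copies of itself yields every run of n glyphs.
    ngrams = zip(*(glyphs[i:] for i in range(n)))
    counts = Counter(ngrams)
    return [(gram, c) for gram, c in counts.most_common() if c >= min_count]

# Made-up glyph IDs standing in for a transcribed inscription.
inscription = ["A", "B", "C", "A", "B", "C", "A", "B", "D"]
for gram, count in recurring_groups(inscription):
    print(gram, count)
```

    Groups that recur often are only candidates for words or affixes; matching their frequencies against a guessed language is the harder, later step.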
