2020-05-26

[Technology] Data Debt: part 3

While in school, and for a period after, I neglected my systems of organization.  I continued to accumulate information, but without spending effort to manage it.  This has resulted in a large data debt. 

I have
  • years of e-mail, chat and text logs in various formats that I want to de-duplicate and consolidate.
  • thousands of screenshots and saved images of memes and wallpapers and artwork from some of my favourite video games and media sitting in massive folders.
  • years of photos grouped just by date taken (surprisingly useful) with no indication of people present (actually not a huge problem given Google Photos).
  • thousands of To Do items spread across various formats and systems that have no priority and just end up in an endless queue that only ever grows.
  • old websites, blogs, accounts that are redundant or obsolete.
  • projects that need starting, video games that need completing.
Part of my problem is hope.  I have hope.  Hope that this can all be managed.  Hope because I have the skills necessary.

Part of my problem is that I am late.  There were good opportunities to act and keep up, especially when transitioning from one process to another, or at the moment I acquired a record.  They were missed.  And now I have debt.

I just need the time.  So here I am, stealing an hour here, an hour there, out of my busy days, not dealing with problems of today, but dealing with a self-created problem of years past, for a goal that, like my dad's, may be pointless.

How many photos do I ever review?
How many e-mails will I ever re-read?
How many To Do items do I ever expect to complete?

My alternative is recency.  Focus on what is recent and applicable to me today.  But I know I can do it, and I know that, like my father sifting through his decades of paper data, there are gems to be found.  Memories to recall.  People that I do not want to lose as the unfrequented connections in my brain fade to make room for more recent experience.

I have created a large data debt, one that might be manageable with the programming skills I have, and maybe more manageable as AI continues to improve.  But does it matter?  Probably not.  Just my vanity.

[Technology] Data Debt: part 2

As just noted, my father managed to accumulate quite a bit of information in physical form that he now spends time sorting through and discarding what has no value after all, which is most of it.

I think about that in relation to the data I keep, mostly digital.  Files upon files upon files.  Photos upon photos.  A vain effort to preserve every text message and e-mail forever.  A vain war against data loss, against entropy, against chaos.  Vain, my vanity.

I am not a fan of the search-first paradigm.  I love being able to meaningfully search my information.  But I do not believe, at present or perhaps ever, that search will be able to completely and comprehensively report back the information I seek.  I can only search for what I know exists.  Artificial intelligence has made search much more useful than ever before; e.g. I can search for "cake" in Google Photos and I get hundreds of photos, mostly actually of cake!  But I also know that many more photos I have taken of cake are missing, and that some photos present are not of cake.  Precision, accuracy and recall are not at 100% yet, and given the indefinite nature of some definitions, can they ever be?
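To make the cake example concrete, here is a minimal sketch of how precision and recall could be computed for one search.  The photo names and the `precision_recall` helper are made up for illustration; this is not how Google Photos measures anything, just the standard definitions applied to a tiny invented collection.

```python
def precision_recall(returned: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for one search.

    precision: share of returned photos that really are relevant.
    recall:    share of relevant photos that the search returned.
    """
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: five photos that truly show cake...
cake_photos = {"img_001", "img_002", "img_003", "img_004", "img_005"}
# ...and what a search for "cake" returns: three hits plus one non-cake photo.
search_results = {"img_001", "img_002", "img_003", "img_900"}

precision, recall = precision_recall(search_results, cake_photos)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints "precision=0.75 recall=0.60"
```

The catch is the `cake_photos` set: to measure recall I would already need to know every photo of cake I own, which is exactly the knowledge I am missing.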

Organize.  A dead motto.  Another vanity project.  Just as I do not expect search to be able to accurately identify all of the most relevant results for me, I do not believe that I can adequately and exhaustively categorize all of my information in a complete way.

But between the two, I still prefer organizing over searching.  I still prefer a reasonable hierarchy of folders and some futile "tags".  It scales a bit better.  Perhaps I cannot granularly categorize every record I have, but I can at least definitely categorize them into broad groups that I can add specificity to as needed.

What is the major obstacle that I have?  Volume of information.  I collect too much information.  And I am generally unwilling to let any detail go.  Storage is cheap.  I have a 1 TB SSD that I purchased for 80 CAD, and it stores tens of thousands of photos and millions of lines of text from the past decade, in a small block that can fit into the palm of my hand.

And my flexible approach to organizing it all works fine... if I can keep up.  But my data debt grows faster than I can spend my daily available time to manage it.

[Technology] Data Debt: part 1

Technical debt.  Time debt. 

My father is a bit older now, and he has been going through his Stuff.  Not just big physical Stuff like objects and thingy things.  But also informational Stuff, like receipts, letters, newspaper clippings, magazines, mileage notebooks.  Stuff he read or took notes on over the past several decades that has accumulated into quite a few boxes (formerly of shoes), or Piles or Stacks.  Sometimes, bags.

He's looking at all of this informational Stuff, and wondering, what was the point?  He definitely found value in some of it.  One of his joys in the past few years has been surfacing a memory of some event, of something he read or did, and then searching through his many records to find the Evidence of it.  I can understand the thrill.  Finding that newspaper clipping of something he did in our small town, once, or a letter from a friend from 2003, and hey, maybe Richard can find the phone number of the sender now, and hey, I did, and now my dad has renewed contact after a decade and a half.

But there's a lot that he will never need, and never really did need in the end.  He has been meticulous in recording how many kilometres he has driven between gas refills, how much gas he added, and how much it cost.  It has given him a good sense of his spending in that regard, and of his cars' performance.  It is all jotted in tiny notebooks.  Nothing digital.  Nothing that he can load easily into a spreadsheet now to review larger trends.  And, honestly, numbers that he has not needed for three decades.

I love this history that he has collected, these peeks into his life and mind and priorities over the past several decades.  I do not get to enjoy them so directly, though.  I am reluctant to dig through his Stuff; it feels like an invasion of privacy.  He is shredding or recycling stuff now that he does not expect to need after all, which is most of it.  Saving information for rainy days that mostly never came.

I find it interesting that his approach is to review it all.  It definitely has had merit.  He finds random points of interest and it creates conversation topics for us.  It used to be something he could show his friends at the coffee shop, back when coffee shops were places where you could sit with friends.  But there is so much, and so much of what he finds is of no value to him.  I am pretty sure this year he will give up in the face of the sheer volume of information he has collected in the past several decades and just say goodbye to it all.  It was never that useful anyway.
