2011-06-12

GDom: three weeks in!

Hello again.



If you read the planet, you may have seen me publish last week's blog post just an hour ago.  I take some care with these posts, since I don't want to rush garbage out to so many readers.  I send a simpler point-form report to the GSoC mailing list (which did go out last week) and then usually write the nicer blog post the next day.  The last post was waiting on some editing, and I got busy and forgot to put it up.  Sorry!  Here is the current progress, though.



Past week



Scheduling and keeping up-to-date

 

I'm getting better at keeping the git repository for GDom up to date, though still not quite as regularly as my dutiful mentor Alberto Ruiz has requested.  I do have some school work this summer as part of my Master's that takes up the early part of my week, so I'm going to start my GSoC work earlier in the week.  That way everything should be ready by Friday evening, instead of my mailing list report going out late on Friday night with the blog post following later.



Document, Element, and Node



The support for these three interfaces is almost complete now.  Document will need some novel work on loading and saving from GIO streams, but the interfaces pass the unit tests.  Speaking of which, these three interfaces all have beautiful unit tests now.   There's probably still a little room for improvement, but I'm moving on to focus on other classes now.



Attr



I had intended to finish implementing this and to write tests for it this past week.  The implementation mostly works, but I ran out of time for its test suite, which I'm working on today.



Refactoring



This probably isn't too interesting, but I've had to refactor my initial plans for the class hierarchy a couple of times (it didn't take too long), because not all interfaces described in the spec correspond perfectly to libxml2's DOM support.  We now have a base DomNode that describes the DOM spec's interface, and a BackedNode that provides a general implementation of it using libxml2's Xml.Node.  Interfaces that map onto libxml2's Xml.Node are based on BackedNode, while interfaces that don't are based directly on DomNode and sometimes require some local data.  This might get messy when saving files back, though.  Yikes!
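
To make the shape of that hierarchy concrete, here is a rough sketch.  GDom itself is written in Vala against libxml2; this is Python purely for illustration, and RawNode, LocalNode, and the property names are stand-ins invented for the example, not GDom's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawNode:
    """Hypothetical stand-in for a libxml2 Xml.Node (not the real binding)."""
    name: str
    content: Optional[str] = None


class DomNode(ABC):
    """Describes the node interface from the DOM spec; stores nothing itself."""

    @property
    @abstractmethod
    def node_name(self) -> str:
        ...

    @property
    @abstractmethod
    def node_value(self) -> Optional[str]:
        ...


class BackedNode(DomNode):
    """General implementation that delegates to an underlying raw node."""

    def __init__(self, raw: RawNode):
        self._raw = raw

    @property
    def node_name(self) -> str:
        return self._raw.name

    @property
    def node_value(self) -> Optional[str]:
        return self._raw.content


class LocalNode(DomNode):
    """Hypothetical interface with no raw counterpart: keeps its own data."""

    def __init__(self, name: str, value: Optional[str] = None):
        self._name = name
        self._value = value

    @property
    def node_name(self) -> str:
        return self._name

    @property
    def node_value(self) -> Optional[str]:
        return self._value


backed = BackedNode(RawNode("title", "GDom"))
local = LocalNode("#local")
for node in (backed, local):
    # Callers only ever see the DomNode interface.
    print(node.node_name, node.node_value)
```

The data held by something like LocalNode lives outside the backing tree, which is why writing a document back out needs extra care.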



Upcoming



Build System



When I talk to my mentor Alberto Ruiz next, I hope to discuss autotools versus WAF.  I keep meaning to investigate them more myself, but I had Internet disruptions last week (I went to Ottawa for academic reasons and the Internet at my accommodation was out) and was generally busy, so this is now an item for this week.



GIO integration



Need to figure out the best way to use GIO to load and save files by stream for our underlying libxml2 usage.  Currently, I just have Document reading from a Unix file path.  Hehe. 



Attr test case



Finish this.  My general approach has been to at least stub out all the properties and methods, try obvious implementations for them first (usually doing something with an underlying libxml2 structure), and then go through the spec to write tests for the class.  The tests usually reveal all the tricky points in my handling of libxml2 data, or my class hierarchy, and I make another pass to get the class compliant.  It's FUN!
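
As a small illustration of that stub-then-test loop, here is a sketch in Python rather than GDom's Vala and GTest.  The toy Element class and its method names are made up for the example.  The behaviours under test come from DOM Level 1 Core: getAttribute returns the empty string when the attribute has no value, and setAttribute changes the value of an attribute that is already present.

```python
import unittest


class Element:
    """Toy element; a plain dict stands in for the underlying library data."""

    def __init__(self, tag_name, attrs=None):
        self.tag_name = tag_name
        self._attrs = dict(attrs or {})

    def get_attribute(self, name):
        # The "obvious" first pass would be `return self._attrs[name]`,
        # which raises KeyError for a missing attribute.  The spec-derived
        # test below is what forces the empty-string default.
        return self._attrs.get(name, "")

    def set_attribute(self, name, value):
        self._attrs[name] = str(value)


class ElementAttributeTest(unittest.TestCase):
    def test_returns_stored_value(self):
        el = Element("book", {"lang": "en"})
        self.assertEqual(el.get_attribute("lang"), "en")

    def test_missing_attribute_is_empty_string(self):
        # Spec: the empty string, not an error, for an absent attribute.
        el = Element("book")
        self.assertEqual(el.get_attribute("missing"), "")

    def test_set_attribute_replaces_value(self):
        # Spec: an existing attribute's value is changed to the new value.
        el = Element("book", {"lang": "en"})
        el.set_attribute("lang", "fr")
        self.assertEqual(el.get_attribute("lang"), "fr")
```

Saved to a file, this runs with `python -m unittest`.  Swapping the first-pass lookup back in makes the second test fail, which is exactly the kind of tricky point the tests are there to catch.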



live.gnome.org/XML updates



I have a list of things to go up there, now that I'm into the project, including


  • list of similar projects and how we differ

  • performance test cases and results (haven't started measuring yet, though)

  • list of features and suggestions to be considered in the future, after GSoC


I think I'd also like to put a simple class diagram up.



Good day



Thanks for reading this far.  Also, as noted in my previous post, if you have specific test cases (e.g. large or complex files) that you'd like to see supported, please send them to me: aquarichy at gmail dot com.



 Cheerio!

GDom, after two weeks

This was actually supposed to go up last Monday, but got missed in the busy-ness. I did send my weekly report to the GSoC mailing list, though those reports are less verbose :)




GDom is now two weeks old.  Here's what we have so far:


  • Most of the DOM spec's functionality for Node and Document is working

  • Lots of other classes work partially

  • Beginnings of test suites for Document, Node, and Element


Enough of the base should be laid now that I'll be able to make better commits against git.  I've often found myself in states of not-compiling as I tried to flesh out large chunks of API at once for expediency.



Past week



I managed to do more research, reading the DOM Level 1 Core spec more fully (rather than the quicker review I did when I started), and it answered many of the questions I encountered after writing code the previous week.



Performance



Many people have offered a lot of suggestions regarding the direction of GDom so far and I got to discuss them with my mentor this morning [last Monday].  To ensure something useful gets created and implemented well within the GSoC time frame, GDom will remain limited to essentially wrapping libxml2 and complying with the DOM Level 1 Core spec.



In the comments from last week [two weeks ago], there was notable concern about performance of libxml2 and GObject.  GDom will not be directly addressing those.  It will simply be providing a much nicer API for GNOME developers to work with in most cases.



That said, I am quite interested in the resulting performance and would like to start testing performance as I go along, once most of the base has been laid.  I'll probably simply measure the memory and time performance differences between libxml2 and GDom and identify some choke points.  I don't expect GDom to be faster, but I hope for it to not be significantly worse.   After the Summer of Code is over and GDom can provide a pretty, standard, GNOME-y API, it will be worth considering a faster, more efficient implementation of the interface.  This is something I'm interested in pursuing, and would be something I would have started with this summer except for the strict time frame and progress requirements.
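
For a sense of what such a measurement could look like, here is a minimal sketch.  It is Python, not Vala, and everything in it is an assumption for illustration: xml.dom.minidom stands in for the underlying parser, parse_wrapped is a hypothetical wrapper layer, and make_document builds a synthetic workload where real test files would go.

```python
import time
import tracemalloc
import xml.dom.minidom as minidom


def measure_time(func, *args, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def measure_peak_memory(func, *args):
    """Return the peak number of bytes traced while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def make_document(n_items):
    """Build a synthetic XML workload; real files would replace this."""
    items = "".join(f'<item id="{i}">value {i}</item>' for i in range(n_items))
    return f"<root>{items}</root>"


def parse_raw(xml_text):
    """Baseline: just parse with the underlying library."""
    return minidom.parseString(xml_text)


def parse_wrapped(xml_text):
    """Hypothetical wrapper layer: parse, then wrap every item element."""
    doc = minidom.parseString(xml_text)
    return [{"node": el} for el in doc.getElementsByTagName("item")]


xml_text = make_document(2000)
for label, func in (("raw", parse_raw), ("wrapped", parse_wrapped)):
    seconds = measure_time(func, xml_text)
    peak = measure_peak_memory(func, xml_text)
    print(f"{label}: {seconds:.4f} s, peak {peak / 1024:.0f} KiB")
```

Time and memory are measured in separate runs because tracing allocations slows the code down.  Note also that tracemalloc only sees allocations made through Python, so a real comparison against a C library like libxml2 would need an external tool for the memory side.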



Request for you, XML-using reader!



If you have any specific test cases (e.g. work loads you'd like me to consider like super huge files you deal with :D), please send or explain them to me.  I will test with those and see what I can do to ensure GDom isn't useless to you :D



My e-mail: aquarichy at gmail dot com



Features



Some people have named some features they'd like to see on top of GDom that go beyond the spec.  I'm going to track these and keep them as a list at live.gnome.org/XML to keep them in mind and consider pursuing them after the summer.



Similar Projects



So far, I've encountered a few similar projects and had one pointed out to me.  They're all similar in that they handle XML documents and they're written in Vala.  They also no longer seem active and/or aimed to implement the functionality themselves rather than reusing libxml2.



They were


 If you know of another project that I should peek at and consider, let me know.  :D



Tests



I've started implementing GTests to guide the implementation of details for the classes.  Hopefully this will increase API compliance and correctness, so anyone already familiar with the DOM API can use the code without surprises.



Next Week [plans for this past week, actually]


  • Talk to my mentor Alberto Ruiz about the build system.  I don't know autotools too well, and WAF was brought up as a potential solution.  

  • More implementation and testing for Document, Element, Node, and Attr.

  • Ensure I commit and push files more regularly.

