Creating Sitemaps for Very Large Websites is Tricky

Over the weekend, I started working on a sitemap for my business research project. I made one permanent URL per company page and one permanent URL per SIC category.

The result is a theoretical sitemap that is about 605,000 items long. That’s a lot of XML, which means a LOT of data, which means the sitemap.xml is like… 20-50 megabytes. What could possibly go wrong?
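For a rough idea of one way around this: the sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so a list this size has to be split into several files tied together by a sitemap index. Here is a minimal sketch of that splitting, not necessarily what I ended up doing. The function names, the `sitemap-N.xml` file naming, and the `example.com` base URL are all made up for illustration.

```python
from xml.sax.saxutils import escape

# Per-file URL limit from the sitemaps.org protocol.
MAX_URLS_PER_FILE = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def render_sitemap(urls):
    """Render one <urlset> document for a single chunk of URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n'
        f"{entries}"
        "</urlset>\n"
    )


def render_index(sitemap_urls):
    """Render the <sitemapindex> document that points at each chunk file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}"
        "</sitemapindex>\n"
    )


def build_sitemaps(urls, base_url):
    """Return a dict of {filename: xml_text} for the chunk files plus the index.

    The file names (sitemap-1.xml, sitemap-index.xml) are hypothetical.
    """
    files = {}
    for i, part in enumerate(chunk(urls), start=1):
        files[f"sitemap-{i}.xml"] = render_sitemap(part)
    # Build the index from the chunk files only, before adding it to the dict.
    index_xml = render_index([f"{base_url}/{name}" for name in files])
    files["sitemap-index.xml"] = index_xml
    return files
```

With roughly 605,000 URLs this produces 13 chunk files plus one index file, and the index is the only thing that has to be submitted to a search engine.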

Machine Learning Use Cases Per Algorithm

When you start learning about machine learning, it’s easy to be overwhelmed by the sheer number of different types of machine learning there are. Here is a list of the types of algorithms you can use, and when you should use them.

Future Topics For the Serial Podcast

Now that Serial, Season 1, is over, I’ve been thinking about some stories that would be better told in the Serial format, slowly examining one angle at a time, with the soul of the story building over time.

The Time I Met Joan Rivers

I occasionally get invited to attend performances at a resort and casino in the Palm Springs area by a family member who has hookups there. Usually we’ll get seats to a performance, and often before the show, my wife and I will get the opportunity to meet and greet the performer. Usually, the performer is kinda in a trance, stuck in their own thoughts, presumably preparing mentally for the performance, and the meet and greet is more like a quiet handshake and a picture, and that’s that. But one time, Joan Rivers did a show there, and in the meet and greet, she was clearly a very different kind of person from many of the other performers we’d been able to meet. I also got some interesting insight into what kind of businesswoman she is. It was illuminating.

What is DBPedia and How Do I Use It?

DBPedia, in general, is a linked-data extraction of Wikipedia. If you’ve been living under a rock and don’t know what Wikipedia is, it’s a crowdsourced encyclopedia hosted on the internet. In terms of infrastructure, Wikipedia reports on its own wiki page that it is powered by clusters of Linux servers and MySQL databases, and uses Squid caching servers to handle the 25,000 to 60,000 page requests per second that it gets on average. In terms of the product, it is very culturally significant in that it is one of the most referenced sources of general information on earth, if not the outright leader. Again, DBPedia, for all intents and purposes, is a linked-data version of that dataset.
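To make "linked-data version" a little more concrete, here is a small sketch of what asking DBPedia a question can look like. It only builds the request URL for the public SPARQL endpoint at `https://dbpedia.org/sparql` and does not send anything over the network. The helper name `build_abstract_query_url` and the choice of the `Wikipedia` resource are mine, purely for illustration.

```python
from urllib.parse import urlencode

# Public DBPedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_abstract_query_url(resource_name):
    """Build a GET URL asking DBPedia for the English abstract of a resource.

    `resource_name` is the last path segment of the matching Wikipedia
    article URL, e.g. "Wikipedia". This helper is hypothetical and only
    assembles the URL; fetching it is left to the caller.
    """
    query = (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource_name}> dbo:abstract ?abstract .\n"
        '  FILTER (lang(?abstract) = "en")\n'
        "}"
    )
    params = {
        "query": query,
        # Ask the endpoint for JSON results instead of the default format.
        "format": "application/sparql-results+json",
    }
    return f"{DBPEDIA_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_abstract_query_url("Wikipedia"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the abstract as structured data rather than as a web page, which is the whole point of the linked-data approach.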

Answer Synthesis is the Future, Let me Tell You What It Is

The act of computationally creating an answer via cognitive computing or conceptual reasoning, rather than searching for it with text, curiously gets described in many ways, but nobody ever seems to talk about it directly. It’s always talked about in terms of how it is done. I propose we call it “answer synthesis”. Let’s dig deeper.

What Type of Problem is Ubiquitous Computing, Really?

Ubiquitous Computing, as a term, has been around for quite some time now. It refers to a state of computing in which data, interfaces, computing power, etc., are essentially omnipresent and available for interaction in a wide variety of forms and for a wide array of purposes. In essence, when people talk about the Internet of Things, they are usually describing what others refer to as ubiquitous computing. One of the aspects of this paradigm that makes it ubiquitous is a somehow-universal interoperability between all connected things.

Separately, there should be a sense of ambient intelligence that persists around all of these interacting agents. Obviously, interoperability, intelligence, high availability, access, security, communication, data interoperability, data analysis, prediction, etc., all fall under the umbrella of the term. However, does all of this really need to be solved in order to give the user the experience of interoperability and ambient intelligence? I think not. Either way, there is a lot to think about when it comes to putting your finger on the real problems left to solve in this space.

Is Semantic Web Dead or Alive?

The semantic web is alive, and I will tell you why. But first, let me tell you how I arrived at this conclusion.

When I first came to my current job, I was tasked with writing an automated implementation of Schema.org as a service, which multi-site owners could use as a shortcut for tagging and structuring their site data in order to acquire rich snippets, and ultimately to get better search engine performance.

During that time, I learned a lot about Schema.org, semantic web technologies, linked data, and Google. So, with that said, if you’re here wanting to know if you should care about the semantic web, let me drop some knowledge.
