Overhauling Realtime Webviews in Scholarly Publishing
For years, counting page views, PDF, and XML downloads at PLOS was difficult. Our methodologies were antiquated and prone to high-cost failures. After auditing our old system (a tangled web of cron jobs, scripts that batch-processed log files, and a very old Drupal application), we decided it was best to take it out behind the barn.
With a clean slate, we launched a new system in February that employs a modernized pipeline for counting page views, a new API for passing the data between our applications, and a real-time UI implementation that cuts time to first ALM (article-level metric) from 48 hours to minutes.
Before we started writing a line of code, we first needed to answer the fundamental question: What is a page view?
While the answer seems obvious, our soon-to-be robot overlords make it very difficult. It is estimated that over half of all web traffic is automated bots, scripts, and terminators sent from the future flying from link to link. For our purposes, we wanted to do our best to eliminate these from our counts. To do so, we’ve switched from pure logfile parsing to a methodology similar to the one employed by web analytics behemoth Google Analytics.
This switch eliminated almost all of the bot traffic from the counts. We no longer rely on the honesty of an automated script reporting itself in the User Agent string, but instead require the JavaScript on the page to be executed, which most bots out of pure efficiency won’t do.
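Here is a minimal sketch of the idea, not our production code. The `/collect` endpoint, the payload fields, and the placeholder DOI are all assumptions made up for illustration. The point is that the page itself reports the view, so a client that never runs the JavaScript never gets counted.

```javascript
// Hypothetical sketch: endpoint, field names, and DOI are assumptions, not the PLOS implementation.

// Build the payload describing a single page view.
function buildPageView(doi, type, now) {
  return { doi: doi, type: type, timestamp: now.toISOString() };
}

// Report the view. Only clients that actually execute this script get counted,
// which is what filters out most bots.
function reportPageView(payload, endpoint) {
  if (typeof navigator !== "undefined" && typeof navigator.sendBeacon === "function") {
    // sendBeacon queues a small POST that is still delivered if the user navigates away.
    return navigator.sendBeacon(endpoint, JSON.stringify(payload));
  }
  return false; // Not in a browser (or no Beacon API support): nothing is sent.
}

const view = buildPageView("10.1371/journal.pone.0000000", "html", new Date(0));
reportPageView(view, "/collect");
```

Keeping the payload builder separate from the sending step makes the counting logic easy to test outside a browser.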
Second, we switched from a batch processing system to a real time streaming architecture. To do so, we employed the open source Apache Kafka distributed data streaming platform in order to process one message at a time, rather than 24 hours of logs every night.
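To make the difference concrete, here is a rough sketch of one-message-at-a-time counting. It is an in-memory stand-in rather than real Kafka client code, and the message shape (`doi`, `type`) and the `handleMessage` name are assumptions for illustration.

```javascript
// Hypothetical sketch: an in-memory stand-in for a stream consumer, not Kafka client code.

// Running totals keyed by article, updated as each message arrives.
const counts = new Map();

// Handle exactly one page-view message and return the article's updated totals.
function handleMessage(message) {
  const totals = counts.get(message.doi) || { html: 0, pdf: 0, xml: 0 };
  // Ignore message types we do not track.
  if (Object.prototype.hasOwnProperty.call(totals, message.type)) {
    totals[message.type] += 1;
  }
  counts.set(message.doi, totals);
  return totals;
}

// In a real deployment these would arrive from a topic subscription.
const messages = [
  { doi: "article-a", type: "html" },
  { doi: "article-a", type: "pdf" },
  { doi: "article-b", type: "html" },
  { doi: "article-a", type: "html" },
];
messages.forEach(handleMessage);

console.log(counts.get("article-a")); // { html: 2, pdf: 1, xml: 0 }
```

Because the totals are current after every message, there is no nightly job to wait for.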
Finally, with substantial help from Python wizard Sebastian Bassi, we replaced an ailing Drupal application and created a lightweight API using Flask and Swagger, enabling our internal services to consume the real-time data.
Now when you view the metrics on an article, you’re viewing the latest and greatest statistics. As a side effect of this new architecture, we now collect and display the view metrics within seconds of the article being published, instead of waiting for the nightly job to run, which often delayed any metrics from showing for almost 2 days. Pretty cool!
10 ways to make WordPress faster. Easy things you can do all by your lonesome.
It’s easy to forget that first-time visitors to your website won’t have a cached version of all of your cool background images to load in an instant. First impressions matter, especially with first-time customers, so it’s imperative that your website loads quickly. Luckily there are some easy ways to optimize your website and make WordPress faster.
I’m not sure how I just found WordPress.tv. I have been kinda bummed that there wasn’t a WordCamp coming to San Diego anytime soon, but the video lectures and presentations on here have been super helpful in learning about industry best practices.
This lecture by Chris Coyier (who writes great articles about web development and designed codepen.io) is super useful. He offers 10 ways to make WordPress faster.
His premise for choosing what to talk about is all business. The solutions must:
1. Have a significant effect (no 1%-2% load time improvements or improving things that users don’t actually notice).
2. Be doable by a single person (no need for a special server side guy doing a bunch of custom crap).
3. Not be complex.
4. Require minimal ongoing maintenance.
Probably the most important thing you should take away from this lecture is the breakdown of loading effort between the front end and back end of your website. With 80% of load time being spent on loading front-end elements, make sure you spend some time optimizing your markup and consolidating CSS and JavaScript into single files. Also, if the page is heavy on JavaScript, it’s often best to call your JavaScript from the footer so your browser isn’t stuck fetching a ton of library files before it begins to load the DOM. Finally, you should always use a CDN to deliver libraries like jQuery or YUI. Trust me, let Google or Microsoft’s servers do the heavy lifting for you.
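Moving your script tags to the footer is a markup change, but you can get a similar effect from script itself. This is a rough sketch of that alternative, not something from the lecture. The `loadScriptLate` helper and the CDN URL are hypothetical placeholders.

```javascript
// Hypothetical sketch: loads a script only after the page's own markup has been parsed.
// The helper name and the URL below are placeholders, not a real CDN path.
function loadScriptLate(doc, src) {
  const script = doc.createElement("script");
  script.src = src;
  script.async = true; // Do not block anything else while this downloads.
  doc.body.appendChild(script);
  return script;
}

// Only runs in a browser, once the DOM is ready.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", function () {
    loadScriptLate(document, "https://cdn.example.com/library.min.js");
  });
}
```

Passing the document in as an argument keeps the helper easy to try out with a stand-in object.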
How Microsoft lost my trust
This is a story of love and betrayal. A lovestruck gamer who grew apart from not one, but two beloved systems, left to watch as his favorite company lost its way.
I consider myself to be a well-rounded gamer. By the end of this generation, I owned all three consoles, but if you looked at my library, you could tell I picked a favorite. I’ve red-ringed two 360s and filled my hard drive to the brim, and my library for Xbox is 3 times larger than the rest of my games combined, but I can’t do it anymore. I can’t, in good conscience, support the Xbox One.
It began with Xbox Live. I can understand paying for multiplayer. I get it, you have servers to maintain and systems to manage. I’ll pay for a quality service if it’s worth the money. But then, they held Netflix hostage at the end of an Xbox Live Gold knife. Already being a subscriber, I let it slide. Then, like a drug pusher, you shoved Microsoft Points on your gamers and strangled the digital marketplace. Every time I wanted to buy a DLC or Arcade game, you made me convert my money into some asinine point system that never zeroed out. I’m left staring at the 200 leftover points on my account knowing I’ll never get to spend that money. You can pocket that, and tell your shareholders how you stole a bunch of money by being an inconvenient ass.
Then came the disaster, the messy breakup, and the attempt at making up that has permanently left a terrible taste in my mouth.
Like an overprotective jealous girlfriend, you wanted to check in on me every 24 hours. You wanted to wrap your failed DRM technology around all of your games, letting the world know you’d rather inconvenience all of your paying customers than lose another dollar to a fringe of hackers who weren’t going to buy your stuff anyways.
Worst of all, you went after poor people. You tried to outlaw selling used games, alienating your retailers, and screwing over all of the families who can’t afford buying full price games. I’m glad your profit is more important than people.
Even though the market forced you to renege on all of your greedy plans, you lost my trust. You’ve made it clear the suits are in charge. You want to let Wall Street build your worlds and they can have the next 6 Call of Duty games, if they want. I’m not going to spend my limited time on earth playing the same game over and over.
See, for me, video games are an almost spiritual type of escapism. It throws the most advanced engineers and artists at building worlds I would never otherwise get to experience. In this generation of consoles, the only company, system, and platform that comes close to my personal philosophy of gaming is Steam, and its ambitious open source gaming OS on the SteamBox. Can you imagine being able to play your entire Steam library on cutting edge hardware, for the rest of forever? Sales that throw the triple A titles at you for a couple of dollars? It’s like a freaking gaming Shangri-La.
My friend joked with me that he’ll put his steam box next to his time machine, as if Gabe Newell at Valve, armed with billions of dollars, has ever let him down.
Computer and Human Languages
When I first decided I wanted to work as a programmer, I gave serious consideration to a trend that I wanted to make damn sure I could avoid as long as possible: outsourcing. One day you have on-the-job experience, you may even be a highly skilled and experienced worker, and the next your livelihood becomes immediately unprofitable. Years of work and practice could vanish practically overnight.
While I think this race to the bottom for cheapest labor possible will eventually flatten out, I didn’t want to be stepping into a career where I would be starting at day zero of my experience at 40 or 50. I may think differently when I’m older.
I asked web developers I knew if they were concerned about being outsourced and largely, everyone seemed pretty confident they weren’t. There are some obvious disadvantages to working with an outsourced programmer. Trying to coordinate communications across 12 or 13 time zones is a pain in the ass: nobody wants to be on a conference call at 5am or at night when they could be spending time with their family. Language barriers are always difficult to navigate and key elements of your project can fall through the cracks because of a miscommunication. Any money you save in hourly rates is lost to hours of productivity thrown down the drain.
While programming always feels like a very international industry, face to face business is still extremely important. In my opinion, a lot more gets accomplished in a well organized meeting over some coffee than trying to keep tabs on an e-mail chain. It’s also good to look people in the eye when talking about deadlines, and it’s a lot easier to tell if someone is BSing you when you can see their face.
But interestingly enough, the reason why I’m most confident about not being outsourced is a simple fact I realized after beginning a job working with an incredible French programmer: programming is in English.
Look at this piece of JavaScript; it doesn’t matter if you know JS or not:
$('.menu-item').hover(function(e) {
  e.preventDefault();
  $('.menu-hover', this).stop(true).animate({ top: '0px' }, function() {
    // Animating backgroundColor needs jQuery UI or the jQuery Color plugin.
    $(this).animate({ backgroundColor: 'white' }, 'fast');
  });
});
programmer and I still think it will be a tremendous skill to have in 2022. Whatever the world looks like then the web will be inevitably ubiquitous.
Code as art versus Code as a tool
I don’t have a deep history writing professional code, which sometimes makes me reluctant to write blog posts that don’t add some sort of new trick or fascinating javascript library I created. When I watch a seasoned expert craft a script, I’m still mystified as they pull functions and tools out of thin air that solve a problem in a clever 10 lines. As I improve as a programmer, I have noticed a dichotomy in programming that I find really fascinating: code as art vs code as a tool.
Before I began programming, when someone pulled up the source code of a web page or started running Unix commands in a terminal window it was all gobbledegook. It was just random strings of punctuation marks I rarely use and abbreviations that I didn’t understand. Now, code appears in chunks, recognizable chunks with purpose I can discern. I don’t pinpoint the exact functioning, but instead get a feel for and make predictions about how data might be passed through a function. Everything is structured with a purpose and moving one comma out of place or forgetting to close a parenthesis could break the machine, and everything would fail. To me, it appeared a profession with no tolerance for error.
I quickly found out that programming, and specifically programming for the web, is a mess. It’s filled with dead ends, variables with nondescript names, and weird workarounds. I regularly find myself knocking wonky elements off the screen or hiding them under another element, sometimes even hiding things in plain sight to create different illusions. Regularly, programmers refer to ways to “trick” the server or browser, like we are some sort of magicians hiding our true intentions from the machine.
These goofy hacks create beautiful corners of the web, some with no purpose (this one has raptors, I highly recommend checking it out), while others provide comical and useful tools used every day to generate cross-browser CSS. I think code with syntax highlighting looks pretty cool on its own, even without any context.
On the other end, we have machine-like code optimized to the last character, silently performing its duty millions of times a day. These honed programs and snippets are not designed to be pretty, but to serve a narrow function and nothing more. They are trimmed of the fat of the human brain, minified into unreadable mashes of code.
Take for instance the Google Analytics tracking code. Don’t worry if you don’t understand the script, I don’t even know what most of it does. Wtf is gaq.
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
In under 15 lines of JavaScript, Google opens up a window onto all of your visitors. You can see how long they stay, which country they are from, what outdated browser they use. Obviously, there is a workhorse web app powering Google Analytics, but I appreciate these super-optimized snippets.
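For what it’s worth, `_gaq` turns out to be just an array used as a command queue. The page pushes commands into it before ga.js has loaded, and once the library arrives it works through the backlog and takes over `push`. Here is a minimal sketch of that pattern. The names `queue`, `handlers`, and `installTracker` are hypothetical, and this is my illustration, not Google’s actual code.

```javascript
// Hypothetical sketch of the command-queue pattern; not Google's ga.js source.

// Before the library loads, commands simply pile up in a plain array.
let queue = [];
queue.push(["_setAccount", "UA-XXXXX-X"]);
queue.push(["_trackPageview"]);

const log = []; // Records what the "library" executed, for illustration.

// When the library loads, it replays the backlog, then hands back an object
// whose push() runs each new command immediately.
function installTracker(pending) {
  const handlers = {
    _setAccount: (id) => log.push("account:" + id),
    _trackPageview: () => log.push("pageview"),
  };
  const run = (command) => {
    const [name, ...args] = command;
    if (handlers[name]) handlers[name](...args);
  };
  pending.forEach(run);
  return { push: run };
}

queue = installTracker(queue);
queue.push(["_trackPageview"]); // Now executes right away.

console.log(log); // [ 'account:UA-XXXXX-X', 'pageview', 'pageview' ]
```

The nice part of this design is that the page never has to care whether the library has finished loading: it calls `push` either way.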
Websites like CSS Zen Garden and WordPress like to analogize code to poetry, and I would have to agree. You must choose every character deliberately and remain focused on what you want to express. But at the end of the day, your code is converted into 01010101 and back in time frames that are nearly inconceivable and across oceans that used to take years to travel. Or hell, you could have written some of the millions of lines of C++ that landed with the Mars rover. What test cases do you even begin to prepare for on an alien planet?
How to create forms in WordPress? Overcoming your fear of code. Never mind, there’s a plugin for that.
A lot of HTML and CSS is self-explanatory. HTML elements like <p>, <h1> through <h6>, and <br /> are easy to wrap your head around. In CSS, it is similarly easy to decipher what the rules do. background-image: url('filename.jpg') is pretty clear: it sets the background image of an element and accepts the file name as its parameter. When we get to generating forms in HTML, I notice designers and developers who are learning the basics of the DOM find it confusing, and understandably so.
Look at the code below:
<form name="input" action="html_form_action.php" method="get">
Username: <input type="text" name="user">
<input type="submit" value="Submit">
</form>
This is a very basic one-field form submission, but as you read through the code you’ll notice some WTF elements that can easily be confusing. What does the “name” attribute even do? I thought we named elements using class= or id=? What’s the difference between setting the method to “get” or “post”? This also might be your first reference to a server side scripting language, so the action= attribute might be even more confusing.
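A quick sketch in JavaScript helps answer the first two questions. It mimics how a browser serializes this form with the default encoding, and “alice” is just a made-up value typed into the Username box. The name attribute becomes the key that the server side script reads, and the method decides where the name=value pairs travel.

```javascript
// Sketch: how the browser serializes the form's fields ("alice" is a made-up input value).
// The submit button has no name attribute, so it contributes nothing.
const fields = new URLSearchParams({ user: "alice" });

// method="get": the data is appended to the action URL as a query string.
const getUrl = "html_form_action.php?" + fields.toString();
console.log(getUrl); // html_form_action.php?user=alice

// method="post": the same name=value pairs travel in the request body instead.
const postBody = fields.toString();
console.log(postBody); // user=alice
```

That is also why class= and id= don’t help here: they are for styling and scripting on the page, while name= is what labels the data sent to the server.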
While you could (and should) learn the syntax for creating forms, there are often easier solutions in WordPress.
How to create forms in WordPress using Plugins:
- Gravity Forms (Paid plugin)
- Contact Form 7 (free)
- Jetpack WordPress Plugin (free)
Gravity Forms is worth every penny and if you are a WordPress developer I would recommend picking up a developer’s license. For free projects, Contact Form 7 lets you design and generate shortcodes to place throughout your site. I personally don’t use Jetpack but I’ve heard others speak highly of its wide feature set.
Check out this article from WP Bricks for some other perspectives on how to create forms in WordPress.
You get it. Here’s a polar bear.
I said in my last post that I was going to make a video of my new keyboard to show how clicky it is but you get it. It’s a clicky keyboard. It has been really nice and for the last few days I have been finding any excuse to type random strings of text into things just to feel the sensation of the mechanical keys. It also makes me feel productive to be making a bunch of noise.
Not to disappoint, here is a video of a polar bear learning how to walk.
Excited for my Mechanical Das keyboard
Whenever I purchase something it is usually an exhausting debate with myself about whether or not I should spend the money, if the product has sufficient value to justify a purchase, and if this is the best possible choice on the market. But every once in a while, I splurge. I dropped way too much money on something I plan to use hundreds of thousands if not millions of times in my lifetime: a mechanical Das Keyboard.
The advantages of having a mechanical switch keyboard are clear: reduced fatigue, increased accuracy, and greater words per minute. The mechanical switches are supposed to last approximately 10 times longer than dome membrane switches. They also provide that satisfying “clicky-ness” of older keyboards like the fabled IBM Model M that many people still swear by and use today.
My only concern is that using the additional USB ports requires two USB connections to the computer, taking up both of the USB ports on my MacBook Air. I’ll have to get a USB splitter, but that’s a small price to pay for typing nirvana.
When it arrives next Wednesday, I’ll record a video of how loud it is with my setup. (ninja edit. Screw your video watch the polar bear.)
Reading, writing, and coding
This is just a brief introductory post for my blog. Here I will write about things from programming and WordPress to politics and video games. Everything I say on this blog is solely my opinion and if that pisses you off, write some really offensive comments below.
California medical marijuana Supreme Court ruling undermines everything everybody wants.
I found myself enraged today. How could 7 of our most highly esteemed justices be so ignorant of reality? How can 7 highly intelligent individuals completely ignore peer reviewed science, compelling logic and reasonable deduction?
When the California Supreme Court asserted that local governments can ban the distribution, sale, and storage of medical marijuana, they asserted certain facts I find genuinely confusing.
1. Marijuana is prohibited by the federal government via the Controlled Substances Act.
2. The state of California has the right, via the ballot initiative process, to refuse to enforce the Controlled Substances Act for individuals who have a recommendation from their doctor to use marijuana.
3. Local governments can prohibit all aspects of the production (this includes storefront collectives, delivery services, and private co-operatives) of medical marijuana should they deem those operations to be a nuisance without providing any criteria for what constitutes a nuisance.
Now reality steps in:
4. People who severely need marijuana will get marijuana from other sources.
5. People who moderately need marijuana will get marijuana from other sources.
6. People who want marijuana will get marijuana from other sources.
And who will they all get their marijuana from now? Drug dealers! Drug dealers have cocaine, meth, and ecstasy, couldn’t care less how old you are or where their drugs came from (cartels and gangs), and most likely steer users toward their harder products, where their profit margins are much higher.
See, prohibitionists and reform activists like myself share a common goal: let’s reduce the harm caused by these drugs. Let’s prevent them from getting into the hands of our children and let’s take every effective step toward reducing the violence associated with the illegal drug trade. Today’s ruling does the exact opposite of all of our goals. It makes our communities less safe, empowers violent criminals, and exposes our children because of our society’s inability to deal with itself.
Let’s let science, reason, and compassion dictate our drug policy. We learned our lesson with alcohol prohibition. This was a bad decision.