Google today is publishing for the first time its internal documentation for how and when to open-source its own technology. Bits and pieces of this information have come out over the years, but now it’s all available for everyone to read, under a Creative Commons license.
At the same time, Google is also posting a catalog of its open source projects, along with descriptions of how Google itself uses the tools. Simply put, there has never been anything like that available before. While many of Google’s open source initiatives primarily live on GitHub, that’s not true of all of them — Android and Chromium, for example, are maintained on Google servers. Now it’s easy to browse and search.
Google’s attitude toward making code available for free to anyone has fluctuated over the years, Will Norris, a software engineer working in Google’s Open Source Programs Office, told VentureBeat in an interview. But generally speaking, Norris said, “our philosophy toward open source is we’re OK publishing something publicly unless there’s just a really good reason not to. Our default is, ‘Sure, why not?’”
Often, an open source project emerges because a Googler or gaggle of Googlers enjoyed building something and wanted to share it, Norris said. There isn’t always a direct business reason, but that’s OK — it probably doesn’t hurt if Google technologies gain adoption outside Google’s doors, and seeing who works on the open source projects could help Google find the brightest talent to go after, too.
Open source software is nothing new at Google. The company has operated an open source office for 12 or 13 years and has been involved in open source software communities since its earliest days, Norris said.
Microsoft in recent years has stepped up its open source activity, while Facebook, LinkedIn, Pinterest, Twitter, and other large software companies regularly release open source software.
Google’s popular open source releases in the past two years include the Go programming language, the Kubernetes container management system, and the TensorFlow deep learning framework. On GitHub alone, Google has more than 4,000 repositories and 100 separate organizations, Norris said.
Some projects are more widely used inside Google than others. Google uses TensorFlow for Android, Gmail, Google Maps, Google Photos, Google Play, Google Search, speech recognition, Google Translate, and YouTube, according to the website for that project. Google employs the Lovefield relational database, meanwhile, for Google Play Movies & TV, Google Play Music, and Inbox by Gmail.
The documentation itself is fascinating, too. It says that, among other things, Google checks that open source candidates don’t have links to “Google’s ‘secret sauce,’” that they will help users instead of just competitors, and that they are clear of legal, patent, and privacy issues. Google’s default license is Apache 2.0, but employees are welcome to inquire about whether they need to use a different one.
Google tells employees to take out names and email addresses of Googlers inside source code unless they’ve agreed to have their information included. Code names and internal paths to files shouldn’t be included, either. Google even suggests a command for quickly locating email addresses, along with internal hostnames, paths and IP-style addresses: egrep -r '\.google\.com|@google\.com|google3?/|([0-9]+\.){3}[0-9]+'
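For readers who want to see what that pattern catches, here is a minimal Python sketch that applies the same regular expression line by line. It is an illustration only, not part of Google's documentation: the names INTERNAL_PATTERN, find_internal_references and scan_tree are hypothetical, and the sample strings are made up.

```python
import re
from pathlib import Path

# Same alternation as the egrep pattern quoted above (assumed to behave
# the same under Python's re module, which supports this syntax).
INTERNAL_PATTERN = re.compile(
    r"\.google\.com|@google\.com|google3?/|([0-9]+\.){3}[0-9]+"
)


def find_internal_references(text):
    """Return (line_number, matched_text) pairs for every hit in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in INTERNAL_PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


def scan_tree(root):
    """Walk a directory recursively, roughly like egrep -r.

    Yields (path, line_number, matched_text) for each hit.
    """
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            # Undecodable bytes are replaced rather than raising an error.
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue
        for line_number, matched in find_internal_references(text):
            yield path, line_number, matched


if __name__ == "__main__":
    # Hypothetical sample input, invented for this sketch.
    sample = (
        "# Author: jane@google.com\n"
        "# see google3/foo/bar\n"
        "HOST = '10.0.0.1'\n"
        "print('hello')\n"
    )
    for line_number, matched in find_internal_references(sample):
        print(line_number, matched)
```

The last alternative matches any four dot-separated runs of digits, so four-part version strings would be flagged alongside IP addresses; results are a starting point for manual review rather than a definitive list.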
The documentation also reveals Google’s policies for its employees posting work on github.com, which is of course maintained by the independent startup GitHub. Google suggests that employees keep using their current personal GitHub accounts rather than setting up new work-related GitHub accounts, and it directs employees to tie accounts to their google.com email addresses to “ensure that your commits don’t get flagged as needing a CLA [Contributor License Agreement].”
Google doesn’t want employees creating new GitHub organizations for projects — it already has plenty. It does want employees using two-factor authentication when signing in to GitHub, though. Non-Googlers contributing to Google projects can be marked as collaborators but generally not admins. And Google recommends that if a person who’s leaving is the only one working on an open source project and no team will be assuming responsibility, the person should fork the repository.
The documentation also provides insight into how Google deals with personal projects. Google lets employees in good standing request consideration by an Invention Assignment Review Committee (IARC) to get a copyright release for their work. Notably, getting IARC approval doesn’t mean that employees also receive patent rights or other intellectual property (IP) rights.
A blog post on the news is here.