
LLM training data, or, a broken virtuous cycle

Have you used ChatGPT?

I have, and it’s amazing.

From the whimsical (I asked it to write a sonnet about SAML and OIDC) to the helpful (I asked for an example of a script with async calls using Typescript), it’s impressive.

However, one issue I haven’t seen mentioned is training data. Right now, there are sets of training data used to teach these large language models (LLMs) what is correct about the world and what is not. This podcast is a great intro to the model families and training process.

But where does that training data come from? I mentioned that question here, but the answer is that humans provide it. Human effort and knowledge are gathered on Reddit, Wikipedia, and other places.

Why did humans spend so much time and effort publishing that knowledge? Lots of reasons, but some include:

  • Making money (establishing yourself as an expert can lead to jobs and consulting)
  • Helping other humans (feels good)
  • Internet points (also feels good)

In each case, the human contributing is acknowledged in some way. Maybe not by the end user who doesn’t, for example, read through the Wikipedia wiki editing history. But someone knows. Wikipedia editors know and celebrate each other. Here’s a list of folks who have edited that site for a decade or more.

What about search engines? Google reifies knowledge in a manner similar to ChatGPT. But, cards notwithstanding, Google offers a reputational reward to publishers. It may be in money (AdWords) or site authority. Other applications like Ahrefs help you understand that authority, and I can tell you as a devrel that a high search engine ranking is valuable.

ChatGPT offers none of that, at least not out of the box. You can ask for links to sources, but the end user must choose to do so. I doubt most do, and, in my minimal experience, the links are often broken or made up.

This fact breaks the fundamental virtuous cycle of internet knowledge sharing.

Before, with search engines:

  • Publisher/author writes good stuff
  • Search engine discovers it
  • User reads/enjoys/learns from it on the publisher’s site
  • Publisher/author gains value, so publishes more
  • Search engine “sees” people are enjoying publisher, so promotes it
  • More users read it
  • Back to step one

After, with LLMs:

  • Publisher writes good stuff
  • LLM trains on it
  • User reads/enjoys/learns from it via ChatGPT
  • … crickets …

The feedback loop is broken.

Now, some say that the feedback loop is already broken because Google over-optimized AdWords. Content farms, SEO-focused garbage sites and tricks to rank are hard to stomach, but they do make money from Google’s traffic. This is especially acute with products and product reviews because the path to monetization is so clear; end users are looking to buy and being on page 1 will result in money. I agree with this critique; I’m not sure the current knowledge sharing experience is optimal, but humans have been working around Google’s limitations.

More human labor helps with this. I’ve seen this happen in two ways, especially around products.

  • Social media, where searchers are relying on curation from experts. Here end users aren’t searching so much as browsing from a subset of answers.
  • Reddit, where searchers are relying on the moderators and groups of redditors to keep spam out of the system. Who among us hasn’t searched for “<product name> review reddit” to avoid trash SEO sites? This also works with other sites like Stack Overflow (for programming expertise).

In contrast, the knowledge product disintermediation of ChatGPT is complete. I’ll never know who helped me with Typescript. Perhaps I can’t know, because it was one million little pieces of data all broken up and coalesced by the magic of matrix algebra.

This will work fine for now, because a large corpus of training data is out there and available. But will it work forever? I dunno. The cycle has been broken, and we will eventually feel the effects.

In the short term, I predict that within the next three months, there will be a Creative Commons-style license which prohibits the usage of published content by LLMs.

Using GPT to automate translation of locale messages files

At my current employer, FusionAuth, we have extracted all the user-facing messages to properties files. These files are maintained by the community, and cover over fifteen languages.

We maintain the English language version. Whenever new user-facing messages are added, the English properties file is updated. As a result, the community-contributed messages files are sometimes out of date.

In addition, there are a number of common languages that we simply haven’t had a community member offer a translation for.

These include:

  • Korean (80M speakers)
  • Hindi (691M)
  • Punjabi (113M)
  • Greek (13.5M)
  • Many others

(All numbers from Wikipedia.)

While I have some doubts and concerns about AI, I have been using ChatGPT for personal projects and thought it would be interesting to use OpenAI APIs to automate translation of these properties files.

I threw together some Ruby code, using ruby-openai, the Ruby OpenAI community library that had been updated most recently.

I also used ChatGPT for a couple of programming queries (“how do I load a properties file into a ruby hash”) because, in for a penny, in for a pound.

The program

Here are the results:

require "openai"
key = "...KEY..."

client = OpenAI::Client.new(access_token: key)

def properties_to_hash(file_path)
  properties = {}
  File.open(file_path, "r") do |f|
    f.each_line do |line|
      line = line.strip
      next if line.empty? || line.start_with?("#")
      key, value = line.split("=", 2)
      properties[key] = value
    end
  end
  properties
end

def hash_to_properties(hash, file_path)
  File.open(file_path, "w") do |file|
    hash.each do |key, value|
      file.puts "#{key}=#{value}"
    end
  end
end

def build_translation(properties_in, properties_out, errkeys, language, client)
  properties_in.each do |key, value|
    sleep 1
    # puts "# translating #{key}"
    message = value
    content = "Translate the message '#{message}' into #{language}"
    response = client.chat(
      parameters: {
        model: "gpt-3.5-turbo", # Required.
        messages: [{ role: "user", content: content}], # Required.
        temperature: 0.7,
      }
    )
    if not response["error"].nil?
      errkeys << key #puts response
    end

    if response["error"].nil?
      translated_val = response.dig("choices", 0, "message", "content")
      properties_out[key] = translated_val
      puts "#{key}=#{translated_val}"
    end
  end
end

# start the actual translation
file_path = ""
properties = properties_to_hash(file_path)
#puts properties.inspect
properties_hi = {}
language = "Hindi"
errkeys = []

build_translation(properties, properties_hi, errkeys, language, client)
puts "# errkeys has length: " + errkeys.length.to_s

while errkeys.length > 0
  # retry again with keys that errored before
  newprops = {}
  errkeys.each do |key|
    newprops[key] = properties[key]
  end

  # reset errkeys
  errkeys = []

  build_translation(newprops, properties_hi, errkeys, language, client)
  # puts "# errkeys has length: " + errkeys.length.to_s
end

# save file
hash_to_properties(properties_hi, "")

More about the program

This script translates 482 English messages into a different language. It takes about 28 minutes to run. About 8 minutes of that is the sleep statement (one second for each of the 482 messages), of which more below. To run this, I signed up for an OpenAI key and a paid plan. The total cost was about $0.02.

I tested it with two languages, French and Hindi. I used French because we have a community provided French translation. Therefore, I was able to spot check messages against that. There was a lot of overlap and similarity. I also used Google Translate to check where they differed, and GPT seemed to be more in keeping with the English than the community translation.

I can definitely see places to improve this script. For one, I could augment it with a set of loops over different languages, letting me support five or ten more languages with one execution. I also had the messages file present in my current directory, but using ruby to retrieve them from GitHub or running this code in the cloned project would be easy.
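Here’s a rough sketch of what that loop over languages could look like. It isn’t part of the script above: translate_all and the translator lambda are hypothetical names, the sample messages are made up, and the lambda stands in for the OpenAI call so the example runs without a key.

```ruby
# Hypothetical helper: translate one set of messages into several languages.
# `translator` is any callable taking (message, language) and returning a string;
# in a real run it would wrap the OpenAI chat request.
def translate_all(properties, languages, translator)
  languages.each_with_object({}) do |language, results|
    results[language] = properties.transform_values do |message|
      translator.call(message, language)
    end
  end
end

# Stand-in translator (an assumption for this sketch, not a real API call).
fake_translator = ->(message, language) { "[#{language}] #{message}" }

# Made-up sample messages.
messages = { "[blank]" => "Required", "[confirm]" => "Confirm" }
results = translate_all(messages, ["Hindi", "Korean"], fake_translator)

results.each do |language, translated|
  puts "# #{language}"
  translated.each { |key, value| puts "#{key}=#{value}" }
end
```

Each language ends up with its own hash, so each one could be written to its own properties file.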

The output occasionally needed to be reviewed and edited. Here’s an example:

[blank]=आवश्यक (āvaśyak)
[blocked]=अनुमति नहीं है (Anumati nahi hai)
[confirm]=पुष्टि करें (Pushṭi karen)

Now, I’m no expert on Hindi, but I believe I should remove the English/Latin letters above. One option would be to exclude certain keys or to refine the prompt I provided. Another would be to find someone who knows Hindi who could review it.
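A third option would be to clean up the output after the fact. Here’s a minimal sketch, assuming the unwanted text is always a trailing parenthesized chunk that starts with a Latin letter; strip_transliteration is a hypothetical name, not something in the script above.

```ruby
# Hypothetical cleanup step: drop a trailing parenthesized run of text that
# starts with a Latin-script letter, such as the romanization in
# "आवश्यक (āvaśyak)". Values without such a suffix are returned unchanged.
def strip_transliteration(value)
  value.sub(/\s*\(\p{Latin}[^()]*\)\s*\z/, "")
end

puts strip_transliteration("आवश्यक (āvaśyak)")
puts strip_transliteration("पुष्टि करें (Pushṭi karen)")
```

This would also strip a legitimate trailing parenthetical written in Latin letters, so a human review is still a good idea.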

About that sleep call. I built it in because in my initial attempt, I saw error messages from the OpenAI API and was trying to slow down my requests so as not to trigger that. I didn’t dig too deep into the reason for the below exception; at first glance it appears to be a networking issue.

C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:219:in `rbuf_fill': Net::ReadTimeout with #<TCPSocket:(closed)> (Net::ReadTimeout)
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:193:in `readuntil'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:203:in `readline'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http/response.rb:42:in `read_status_line'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http/response.rb:31:in `read_new'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1609:in `block in transport_request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1600:in `catch'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1600:in `transport_request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1573:in `request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1566:in `block in request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:985:in `start'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1564:in `request'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty/request.rb:156:in `perform'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:612:in `perform_request'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:542:in `post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:649:in `post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/ruby-openai-3.7.0/lib/openai/client.rb:63:in `json_post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/ruby-openai-3.7.0/lib/openai/client.rb:11:in `chat'
        from translate.rb:33:in `block in build_translation'
        from translate.rb:28:in `each'
        from translate.rb:28:in `build_translation'
        from translate.rb:60:in `<main>'

(Yes, I’m on Windows, don’t hate.)

Given this was a quick and dirty program, I added the sleep call, but then, later, added the while errkeys.length > 0 loop, which should help recover from any network issues. I’ll probably remove the sleep in the future.
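One caveat: a Net::ReadTimeout like the one in the stack trace is raised as an exception, so it never reaches the response["error"] check. A small wrapper could catch it and retry. This is a sketch under assumptions, not what I ran: with_retries is a hypothetical helper, and the block below fakes two timeouts instead of calling the API.

```ruby
require "net/http" # loads Net::ReadTimeout

# Hypothetical helper: run the block, retrying when the request times out.
# Re-raises the timeout once max_attempts have been used up.
def with_retries(max_attempts: 3, wait_seconds: 1)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue Net::ReadTimeout
    raise if attempts >= max_attempts
    sleep wait_seconds
    retry
  end
end

# Simulated request that times out twice before succeeding.
calls = 0
result = with_retries(wait_seconds: 0) do
  calls += 1
  raise Net::ReadTimeout if calls < 3
  "ok after #{calls} attempts"
end
puts result
```

In the script, the client.chat call would go inside the block. Newer versions of the library may raise a different timeout class, so check which exception you actually see.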

I signed up for a paid account because I was receiving “quota exceeded” messages. To their credit, they have some great billing features. I was able to limit my monthly spend to $10, an amount I feel comfortable with.

As I mentioned above, translating every message into Hindi using GPT-3.5 cost about $0.02. Well worth it.

I used GPT-3.5 because GPT-4 was only in beta when I wrote this code. I didn’t spend too much time mulling that over, but it would be interesting to see if GPT-4 is materially better at this task.


Translating these messages was a great exploration of the power of the OpenAI API, but I think it was also a great illustration of this tweet.

I had to determine what the problem was, and how to get the data into the model, and how to pull it out. As Reid Hoffman says in Impromptu, GPT was a great undergraduate assistant, but no professor.

Could I have dumped the entire properties file into ChatGPT and asked for a translation? I tried a couple of times and it timed out. When I shortened the number of messages, I was unable to figure out how to get it to ignore comments in the file.

One of my other worries is around licensing. I’m not alone. This is prototype code running on my personal laptop and the license for all the localization properties files is Apache 2.0. But even with that, I’m not sure my company would integrate this process given the unknown legal ramifications of using OpenAI GPT models.

In conclusion

OpenAI APIs expose large language models and make them easy to integrate into your application. They are a super powerful tool, but I’m not sure where they fit into the legal landscape. Where have we heard that before?

Definitely worth exploring more.

How to have great guest posts on a blog

I have another blog that I’ve been running for a few years, called Letters to a New Developer.

It is full of advice I wish I had had at the beginning of my software development career. I even turned it into a book.

One thing I’ve done the entire time I’ve been writing this is to ask for guest posts. No advice is generic; it’s all based on where and who you are. By asking for guests to share their experiences, I expose my readers to a wider variety of viewpoints. This leads to better understanding of the wide world of software.

After all, if you want to work for a FAANG, it’s better if you hear from someone who prepped, interviewed and was hired for such a company, rather than me, who will never work for such a company. Or, if you are looking to understand what it’s like to work in open source, it helps to hear from someone who has made it their career. Have imposter syndrome? Others have struggled with it too. This also works for techniques; it’s great to hear from someone who is a methodical journaler, for example.

I bet you get the point. There’s no way I could have written any of those articles.

Guest posting also helps spread the word about my blog, since a guest poster usually shares their writing with their friends and network.

For the right kind of blog, guest posts can be great. Here are my criteria:

  • Does the blog have decent traffic? If not, folks probably won’t want to guest post. I’d say at least 25 visits a day.
  • Have you done this for a while? A year’s worth of regular posting shows you’re serious.
  • Does the blog present a variety of viewpoints? I wouldn’t want a guest post for this blog, because it’s all me, all the time.
  • Is the blog non-corporate? If it is a company blog, guest posts should be paid for with money, or be part of a co-marketing project.

If you want to have great guest posters, here’s my recipe for finding them and fostering them:

  • Don’t accept inbounds, unless you know the person or can vet what else they’ve written.
  • Don’t accept any content that is off-topic.
  • Set a target guest. For me, it is someone who is or was a developer with a different viewpoint and perspective. I was especially interested in highlighting voices of people who were not white men.
  • Keep an eye out for interesting posts or interviews. Reach out to the authors and see if they might be game to guest post.
  • Be okay with a cross post. I’d say about 60% of my guest posts are original content, but more recently authors want to post the content on their blog first. Offer a link back.
  • Be okay with a repost, especially if the author is prominent. Review existing content and find one or two articles that fit with your blog’s theme. Ask if you can re-publish. Offer a link back.
  • Some folks will say no. They can do so for a variety of reasons (don’t want to write, don’t syndicate their content, etc), and you have to be okay with it.
  • If someone says “I can’t right now, sorry.” reply with something like “Totally get it. I’ll follow up in 6 months, unless you’d rather I didn’t.” Then, follow up. Sometimes they’ll ignore you, sometimes they’ll have more time.
  • If they are writing original content for you, offer to edit it. Find out how much editing they want. I’ve over-edited a few times and that’s a bad thing if someone is volunteering their time and knowledge.
  • If they ask about topics, advise them to focus. Lots of people want to write about “5 things you need to do” but in my experience, articles that focus on one topic are better received.
  • Write up guidelines so you can easily share. This can include audience, benefits, formatting, delivery mechanism, word count, topic ideas, and more.

Hope this helps. Happy guest blogging!

Why Developer Relations Jobs at Startups are Not Entry Level Positions

Developer relations jobs at startups are sometimes seen as suitable for entry-level developers, but this is a common misconception. In reality, these positions require a diverse skill set that can be overwhelming for new engineers.

The Variety of Tasks

Developer relations jobs at startups require a variety of tasks, including writing documentation, creating tutorials, writing example apps, giving presentations, and attending events. You’ll need to be able to prioritize these tasks, drive them to completion (often with the help of other startup colleagues) and re-order them based on changing needs and input.

This is all on top of the normal chaos at a startup, when you will be either searching for product market fit, pivoting or scaling. The combination of these tasks can be overwhelming for new engineers who are still learning the ropes.

The Solitude

At an early startup, you are often the only person doing developer relations. This means that you need to be able to navigate the often chaotic environment of a startup on your own, without much in the way of project guidance or career development. You will need to rely on yourself and your own skills to be successful. This can be intimidating for new engineers who are just starting out in their careers, but with the right attitude and knowledge it is possible to thrive.

Credibility with Developers

Developer relations jobs at startups require credibility with both beginner and expert developers. This means that you need to be able to communicate effectively with developers who are just starting out, as well as those who are experienced in the field.

While entry level developers can empathize with other beginning developers, technically connecting with experienced developers is a tall order. This requires a deep understanding of the technology and the ability to explain it in a way that is accessible to everyone.

Credibility with Founders

Developer relations is a long game, so you need to have buy-in from the founders of the company. They need to invest in you, in the community, and in your activities that strengthen the company’s ties to the community. This requires a deep understanding of the technology and an ability to communicate effectively with developers of all levels. It also requires demonstrating technical prowess and expertise that will be respected by the founders so they can trust that their investment in you and your activities will pay off. Having a few years of experience under your belt can help build this credibility, as it shows that you are familiar with the technology and have built up a base of knowledge that can be used to benefit the company.

Learning New Technologies

Developer relations jobs at startups require you to get up to speed quickly on a variety of technologies. This means that you need to be able to learn new things quickly and be able to explain them to others.

You need to be able to understand the abstractions your product sits upon, as well as the tools it can integrate with. The more experience you have with different technologies, the easier this is. Conversely, this can be a daunting task for new engineers who are still learning the basics.

Foundation of Production Level Code Experience

Finally, developer relations jobs at startups require a foundation of production code experience. This is because you need to be able to connect with developers who will be evaluating your tool.

You need to be able to speak their language and understand their needs. This requires a deep understanding of the technology and the ability to write production quality code.

Where This Advice Doesn’t Apply

If you are joining a larger startup or company with an established developer relations team, much of this post does not apply. In this case, you’ll have founder buy-in, support from team members, and more defined tasks.

You may still have trouble connecting to experienced developers with complex questions, but may be able to connect them to other team members who can help them.

What To Do

If you are interested in developer relations, play the long game with your career. Spend a few years as a software developer, working on a team shipping code that users will enjoy. Learn how developers think and approach problems.

You can also engage in the communities that are important to you, either online (slacks, reddit, hackernews, etc) or in-person (conferences, meetups, etc). Volunteer or speak at events, which can help you understand the nuts and bolts of what goes on.

After a couple of years, you’ll have the software engineering foundation as well as the community experience to set yourself up for a fantastic devrel career.

In Conclusion

In conclusion, developer relations jobs at startups are simply not entry-level positions. They require a diverse skill set that can be overwhelming for new engineers. They require credibility with both beginner and expert developers, the ability to get up to speed quickly on a variety of technologies, and a foundation of production level code experience.


Multi-tenancy options

Multi-tenancy is a key part of building a SaaS product. You want to amortize your software investment across different paying customers. Customers should never be able to access any other customer’s data. And it is often the case that customers don’t want their data intermingled with other customers’ data, though that depends on the type of data and the customer needs.

By separating customers into tenants, you can achieve this data separation. There are a number of levels of multi-tenancy. In increasing order of isolation, they are:

  • no multi-tenancy. In this situation, all the customer data is co-mingled in the database. Any data that is unique to a customer is tied to that customer with an id.
  • logical multi-tenancy. Isolation is enforced in code and the database (every table has a ‘tenant id’ key, and you’re running joins). When you are using this type of isolation, you want to resolve the tenant as soon as possible. This is often done with a different hostname or path. You also want to ensure that any users accessing tenant data are part of the tenant. If you use this approach, mistakes in your code can be used to ‘escape’ the tenant limitation and read other customers’ data. However, one advantage of this approach is that you have operational simplicity: one version of the code to maintain and one version of the database. There may be support for this in your framework, or you may be rolling your own.
  • logical multi-tenancy supported by the database. Some databases support row level security isolation. In this scenario, each tenant has a different user, and the data isolation is enforced by the database. Your code is limited to looking up the correct user for a given tenant.
  • container level multi-tenancy. In this scenario, you run separate containers for each tenant. If you are using a solution like Kubernetes, you can run them in different namespaces to increase the isolation. The operational complexity increases (I did mention Kubernetes, did I not?) but it becomes far more difficult for an attacker to use the access of one tenant to get another tenant’s data. However, now you can have multiple versions of the codebase running. This can be a blessing and a curse, as it allows each client to control their version (if you enable it). This can increase support burden depending on the complexity of your application. You could also choose to run the latest code on every container, upgrading all containers every time a change is made to your software.
  • virtual machine multi-tenancy. Here you use different databases and virtual machines for each tenant. You can leverage common security defense-in-depth practices at the network level, using network access controls and firewalls. This physical isolation makes it even harder for an attacker to escape and view other tenants’ data. However, it increases your operational costs both in terms of complexity (are you going to force everyone to upgrade across the entire fleet?) and support (there may be configuration and/or code drift between the different VMs). If you pursue this, it behooves you to automate the creation of these virtual machines.
  • physical hardware isolation. With this choice, you actually run different hardware for each tenant, possibly in different data centers. This is the most secure, but the most operationally intensive. There are some options for API driven hardware setup, but the isolation, while a boon for security, makes updates and upgrades more difficult.
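To make the logical multi-tenancy option a bit more concrete, here is a minimal sketch in Ruby. Everything in it is made up for illustration (the hostnames, the invoice rows, and the resolve_tenant and find_for_tenant helpers), and an in-memory array stands in for the database.

```ruby
# Made-up data: every row carries a tenant_id, as in logical multi-tenancy.
tenants_by_host = { "acme.example.com" => 1, "globex.example.com" => 2 }
invoices = [
  { id: 10, tenant_id: 1, total: 100 },
  { id: 11, tenant_id: 2, total: 250 },
]

# Resolve the tenant as early as possible, here from the request hostname.
def resolve_tenant(host, tenants_by_host)
  tenants_by_host.fetch(host) { raise ArgumentError, "unknown tenant host: #{host}" }
end

# Every lookup filters on tenant_id first, so an id belonging to another
# tenant is simply not found.
def find_for_tenant(rows, tenant_id, id)
  rows.select { |row| row[:tenant_id] == tenant_id }
      .find { |row| row[:id] == id }
end

tenant_id = resolve_tenant("acme.example.com", tenants_by_host)
p find_for_tenant(invoices, tenant_id, 10) # acme's own invoice
p find_for_tenant(invoices, tenant_id, 11) # nil: belongs to another tenant
```

The weakness described above is visible here too: any code path that skips the tenant filter can read every tenant’s rows.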

What is the best option for your SaaS solution? It depends on the security needs of your customers as well as your cost structure and your operational maturity. The higher the level of isolation, the harder it is to run and upgrade the various systems.

Books and other resources to level up as a software developer

A while ago there was an HN post asking for suggestions on [r]eading material on how to be a better software engineer.

Here’s my list based in part off a comment I made there.

First, books:

  • Secrets of Consulting by Gerald Weinberg because every problem is a people problem.
  • Refactoring by Martin Fowler et al. discusses how and why to refactor, as well as providing a nomenclature for the process.
  • Code Complete by Steve McConnell is a bit dated (the last version I could find was from 2004) but a great overview of the entire software process, from requirements to maintenance.
  • The Mythical Man-Month by Fred Brooks covers best practices about software development, written about a project from the 1960s and 1970s. Nothing new under the sun.
  • The Joel On Software Strategy Letters cover different aspects of software strategy. The link is the first one, but all of them (I think there are five) are great.
  • Letters to a New Developer is a collection of essays helpful to new developers. Note I wrote this book, but I think it does a good job of discussing the “soft skills” in an easily digestible format.
  • The Pragmatic Programmer by Dave Thomas and Andy Hunt. I haven’t read the revised 20th anniversary edition, but the first one opened my eyes to the craft of software.
  • High Output Management by Andy Grove illustrates how to think about throughput.
  • The Phoenix Project by Gene Kim et al. is a fun novel(!) about applying lean management principles to software engineering.
  • Good to Great by Jim Collins focuses on what great companies bring to the table. Helps me evaluate where to work.
  • Managing Humans by Michael Lopp. The whole Rands site is worth reading, but I enjoyed this book about how to manage teams and build software. See above.
  • Don’t Make Me Think by Steve Krug shows ways to think about usability, focusing on webapps. Short and easy.
  • Badass: Making Users Awesome, by Kathy Sierra helps you put yourself in the shoes of your users and think about how to build software they will love. Short and easy.

Then, podcasts and videos:

  • Mastery Autonomy and Purpose, a great video about what people really want in work.
  • The Manager’s Toolbox podcast, focuses on nuts and bolts skills for managing people. You didn’t say you wanted to be a people manager, but knowing what managers think about will make you more effective in any org.
  • Screaming in the Cloud is useful for keeping up with AWS and other cloud provider offerings.
  • SE Radio is a bit dry, but has a great back catalog of software engineering focused episodes.

A few other resources:

  • The Rands Leadership Slack has over 10,000 engineering leaders discussing all kinds of software related topics.
  • CTO lunches is an email list of engineering leaders. The discussions aren’t consistent, but when they happen, they’re great. Plus, it comes to your email.
  • HackerNews is a great way to burn time, but also a great way to keep on top of topics that are top of mind of some of the best developers in the world.

Reading up on software practices can help you level up as a software engineer because you’ll be able to avoid mistakes others have made before you. It can also offer a view of the big picture; knowing how your software helps your organization or company will only make you more valuable as a developer.

Consulting tips

If you are thinking about making the move from being an employee to consulting, I have some thoughts. First, there is a difference between contracting and consulting. Contractors are paid for what they do, consultants are paid for what (and who) they know. I have found it’s a lot easier to get contracting work than consulting work. (Of course, sometimes they blur together.)

If you are focusing on consulting, think about:

What is your source of work? Finding work is a big part of consulting. Options include:

  • your network (it’s fine to hit folks up and say “I’m doing X, do you know anyone who needs it?”)
  • past clients
  • courses
  • a community presence
  • books
  • blog/writing
  • advertising (though this can be easy to waste money on if you aren’t targeted)

I’m sure there are others. Think about and nurture this pipeline. Always ask happy clients for referrals. This is something you typically don’t think about as an employee (it’s someone else’s job). It’s part of your job as a consultant.

How will you get paid? This is the other piece of consulting that people jumping from employment don’t consider enough.

What will your terms be? Have a standard contract. Have buffer in the bank because there will be lean months; prepare for “feast or famine”. You’ll also need to be prepared to chase down payment. Doesn’t happen often and you can prioritize companies that pay promptly, but with what feels like an economic rough period ahead, companies will be stretching payment terms, esp to outside consultants.

Don’t expect to be paid when you do the work; net 30 days is typical. Use a solution like FreshBooks (what I used, years ago, there may be better ones) to automate your invoices and possibly invoice nagging.

Sometimes you will be paid to do things that seem dumb and/or below your pay grade. Raise your concerns, but if you are told to continue, smile and do it. One of the joys of consulting is you are not really “one of the team” which can give you healthy separation from org problems you aren’t trying to solve.

You are not really “one of the team” with any of your clients. Esp if you are a one person show, this can be lonely sometimes if you are used to socializing with your coworkers. Not that they won’t invite you to lunch, etc, but there’s always an awareness that you are a hired gun rather than someone who is on the team. (They might try to recruit you, though. Be prepared for that.)

Taxes/ownership structure/insurance become a bigger thing. Find a CPA, preferably through a referral, and understand and set up the proper ownership structure and tax payments. Typically your tax burden will be higher, but you can write off more stuff as a business expense. You may need some kind of business insurance depending on who your clients are (Oracle made me get E&O, I carried general liability for a while). The Nolo books are good on this if you want to get smart on your own, but you’ll really want a CPA in your corner.

Think about whether you want to be a one-person show or build a team. I personally never got past the one-person show stage (though I tried subcontractors a few times) because I didn’t want to manage and I wanted to do the work. If you are building a team, you will need to focus even more on bringing in work rather than doing it.

Consulting can be fun because you have control of your work and you get to work in different environments without switching jobs. It’s also more stable to have a few purchasers for your labor than just one (an employer). Think about what kind of work you want to specialize in, because the temptation (esp if cash flow is low) will be to take anything you can do (or maybe things you think you can do).

Spend time on professional development. That could be during the work day or after hours. Sometimes clients that know you will pay for you to learn something, but that is not typical in my experience.

If you have the time, create a course or ebook in the domain you are planning to consult in. This can give you some income, but the real benefit is to say “I wrote the book on X” or “I did a course on X”.

Think about passiveish income options. Courses or ebooks (as above) can offer that. So can productized services, web hosting, or access to a tool that codifies your knowledge. Don’t focus on this when you start, but having something like this will help buffer your cash flow.

Prepare to raise your rates yearly and with every new client. No one is going to give you a raise, you have to ask for one. I usually said something like “my new rate for next year is going to be X/hr, please let me know if you have any questions” in an email around this time every year. Be prepared for some clients to not be able to afford you, and to part ways.

Finally, some book recommendations:

  • The Secrets of Consulting is required reading; it covers the people side of consulting, which is really critical.
  • Value Based Fees: How to charge value based fees rather than time and materials/hourly/daily. I never had the guts to do this, but it was eye opening to read about.

GitHub Actions Are Amazingly Easy

GitHub Workflows are automated jobs that can be triggered by various events against a GitHub repository. They are pretty awesome.

GitHub Actions are a way to encapsulate configuration and functionality in a way that can be easily reused in GitHub Workflows.

I was thinking it’d be fun to create some GitHub Actions (yes, I’m the life of the party), so I sat down a few mornings ago to do this. I was shocked at how easy it was.

I followed a few lines of this tutorial to create a workflow. Then I created an action by following this tutorial. Finally, I edited my workflow to use the new action. That was it.

It was amazingly simple and took me about 30 minutes. I ran into one unrelated issue (to set the executable bit on a shell script on Windows, I had to modify the shell script contents in order to ensure the change was sent to the remote repo).
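A different way to handle the executable bit (not the workaround described above) is git’s `update-index --chmod=+x`, which records the mode change in the index without relying on filesystem permissions. Here is a minimal Python sketch that builds and runs that command. The script name `entrypoint.sh` and both function names are hypothetical.

```python
import subprocess
from typing import List


def executable_bit_command(script_path: str) -> List[str]:
    """Build the git command that marks a tracked file as executable."""
    # `git update-index --chmod=+x` changes the mode git records for the file
    # without touching its contents or the filesystem permissions.
    return ["git", "update-index", "--chmod=+x", script_path]


def mark_executable(script_path: str, repo_dir: str = ".") -> None:
    """Run the command inside repo_dir; raises CalledProcessError if git fails."""
    subprocess.run(executable_bit_command(script_path), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # "entrypoint.sh" is a hypothetical script name used for illustration.
    print(" ".join(executable_bit_command("entrypoint.sh")))
```

After running `mark_executable`, the mode change still has to be committed and pushed like any other change.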

If you take a look, you’ll see these are both toy repositories, to be sure. However, the ability to write jobs which will be executed on a git push, pull request or other events is great and removes toil. Being able to extract common functionality to an action is even better. Finally, the ability to share the action publicly by adding it to the GitHub marketplace is fantastic.

I’ve liked CircleCI for a long time, but if I were them I’d be worried.

One issue I found is that the testing/release cycle is pretty tedious (I’ve mentioned action debugging as an issue for a while).

While I was troubleshooting my executable bit error, I had to do the following every time I wanted to test a change:

  • make a change in the action repository
  • create a new tag
  • push it to the remote
  • switch to the workflow repository
  • bump the action version
  • push to the remote
  • wait for the workflow to complete

Not horrific, but pretty tedious. I don’t know if there are other options such as local deployment which would reduce that cycle, but that would be swell.
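The git portion of the cycle above can at least be scripted. The following is a rough sketch, not a recommended tool: it assumes the change in the action repository and the version bump in the workflow file have already been made by hand, and it does not wait for the workflow to finish. The function names, directory names and tag are all hypothetical.

```python
import subprocess
from typing import Callable, List


def release_cycle_commands(tag: str, action_dir: str, workflow_dir: str) -> List[List[str]]:
    """Return the git commands for one test iteration, in order."""
    return [
        # In the action repository: commit the change, tag it, push both.
        ["git", "-C", action_dir, "commit", "-am", f"Test change for {tag}"],
        ["git", "-C", action_dir, "tag", tag],
        ["git", "-C", action_dir, "push", "origin", "HEAD", tag],
        # In the workflow repository: commit the bumped version and push.
        ["git", "-C", workflow_dir, "commit", "-am", f"Use action {tag}"],
        ["git", "-C", workflow_dir, "push", "origin", "HEAD"],
    ]


def run_cycle(commands: List[List[str]], runner: Callable = subprocess.run) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        runner(command, check=True)


if __name__ == "__main__":
    # Print the plan rather than running it; paths and tag are made up.
    for command in release_cycle_commands("v0.0.2", "my-action", "my-workflow"):
        print(" ".join(command))
```

This only removes some typing. The wait for the workflow run is still there.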

Other than that, 10 out of 10, would write more actions.

How To Start Improving a Legacy App

An interesting question appeared on HN recently: “Ask HN: Inherited the worst code and tech team I have ever seen. How to fix it?”

You can read that post and the answers there. I’m going to address a related, but different question in this post.

If you encounter an application that has tremendous business value and yet is not following any modern software management processes, what are concrete steps you can take to help improve the application?

That is, what if you have (or are hired to be responsible for) an app like the poster of the HN thread:

  • making plenty of money for the company
  • no version control
  • old school structure
  • a ball of mud architecture
  • no code deleted, just commented out
  • multiple versions of libraries on the front and back end
  • etc

First off, you need to convince someone that it is going to be worthwhile to invest in this process. If you can’t do that, you are dead in the water. So look for the pain points that occur when best practices are lacking:

  • slow delivery of features
  • catastrophic bugs which lose money, hurt the brand or impact data
  • talent hard to hire
  • hard to improve the application

If you can pinpoint pain caused by the app, you can start to build a case to improve it.

If you can’t, well then, maybe you shouldn’t touch it. If it ain’t broke, don’t fix it!

With that said, here’s my list of what to implement, in rough priority order. Don’t worry about best of breed for the tools, just pick what the company uses. If the tool isn’t in use at the company, pick something you and the team are familiar with. If there is nothing in that set, pick the industry standard. I include a recommendation for the latter.

1. Get the app under version control. Git is best if you don’t have any existing solution. GitHub or GitLab are great places to store your git repositories.

2. Start up a bug tracker. You have to have a place to keep track of all issues. GitHub issues is adequate, but there are a ton of options. This would be an awesome place to get buy-in from the existing team about whichever one they prefer. The truth is it doesn’t matter which particular bug tracker you use, just that you use one.

3. A way to get one click or zero click deploys. A SaaS tool like CircleCI or GitHub Actions is fine. If you require “on prem”, Jenkins is a fine place to start. But you want to be able to deploy changes quickly.

4. Set up a staging environment. With one, you can manually test changes and debug issues without affecting production. Building this will also give you confidence that you understand how the system is deployed. Then you can include that in the build tool process.

5. Unit and system/end to end testing. End to end testing can give you confidence in changes. However, it is overwhelming to add testing to an existing large, crufty codebase. I’d focus on two things. First, unit testing some of the weird logic; this is a relatively quick win. Second, setting up at least one or two end to end tests through core flows (login, purchase path, etc). In my experience, setting up the first instance of each of these is the toughest process, then it gets progressively easier. There’s usually an ‘xUnit’ framework in any language for unit testing. Look for that. I’m not sure what best practice is in end to end testing, but Selenium or Cypress are good for browser based applications.

6. Capture documentation. This might be a higher priority, depending on what your relationship with the existing team is. Few teams will say no to someone helping out with doc. Document high level architecture, deployment processes, key APIs, interfaces, data stores, and more. Capture this in google docs or a wiki if you don’t have an existing solution.

7. Start using data migrations. Having some way to automatically roll database changes forward and back is a huge help for moving faster.
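To make the idea of rolling database changes forward and back concrete, here is a minimal sketch using Python’s built-in sqlite3 module. It is an illustration only; a real project would more likely adopt an existing migration tool for its stack. The table names, the `MIGRATIONS` list and the function names are all hypothetical, and the schema version is tracked with SQLite’s `user_version` pragma.

```python
import sqlite3
from typing import List, Tuple

# Each migration is (version, forward SQL, backward SQL).
# Table and column names are hypothetical.
MIGRATIONS: List[Tuple[int, str, str]] = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
        "DROP TABLE customers"),
    (2, "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)",
        "DROP TABLE orders"),
]


def current_version(conn: sqlite3.Connection) -> int:
    # SQLite's user_version pragma stores an integer we use as the schema version.
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn: sqlite3.Connection, target: int) -> None:
    """Roll the schema forward or back until it is at `target`."""
    version = current_version(conn)
    if target > version:
        for number, forward, _ in MIGRATIONS:
            if version < number <= target:
                conn.execute(forward)
                conn.execute(f"PRAGMA user_version = {number}")
    else:
        for number, _, backward in reversed(MIGRATIONS):
            if target < number <= version:
                conn.execute(backward)
                conn.execute(f"PRAGMA user_version = {number - 1}")
    conn.commit()


def table_names(conn: sqlite3.Connection) -> List[str]:
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    migrate(conn, 2)   # roll forward to the latest schema
    print(current_version(conn), table_names(conn))
    migrate(conn, 1)   # roll back one step
    print(current_version(conn), table_names(conn))
```

The key property is that every forward change ships with its reverse, so a bad deploy can be undone with the same tool that applied it.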

None of these are about changing the code (except maybe the last one), but they all wrap the code in a blanket of safety.
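As an illustration of the unit testing step in the list above, here is what a first test around a piece of “weird logic” might look like with Python’s built-in unittest module (an xUnit-style framework). The function `legacy_shipping_cost` is a made-up stand-in for logic found in the app. The tests pin down what the code does today, not what it should do.

```python
import unittest


def legacy_shipping_cost(weight_kg: float, is_member: bool) -> float:
    """Hypothetical stand-in for a piece of weird logic found in the app."""
    cost = 5.0 + 2.0 * weight_kg
    if weight_kg > 10:
        cost += 7.5  # unexplained surcharge; behavior preserved as found
    if is_member:
        cost = cost * 0.9
    return round(cost, 2)


class LegacyShippingCostTest(unittest.TestCase):
    def test_light_package(self):
        self.assertEqual(legacy_shipping_cost(1, False), 7.0)

    def test_heavy_package_gets_surcharge(self):
        self.assertEqual(legacy_shipping_cost(12, False), 36.5)

    def test_member_discount_applies_after_surcharge(self):
        self.assertEqual(legacy_shipping_cost(12, True), 32.85)
```

Saved in a file, this can be run with `python -m unittest`. Once tests like these exist, refactoring the function is much less scary, because any change in behavior shows up as a failure.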

After implementing one or more of these, the team should be able to move faster and with more confidence. This will build trust and allow you to suggest bigger changes, such as bringing in a framework or building abstraction layers.

What in-person conferences offer: feedback

I was listening to a Twitter space recently and the host had an interesting take: for the amount of money you would spend flying a speaker to an international conference (call it $5000, though of course the actual number varies depending on location, timing and more), you could record a great educational video and get it in front of many folks on YouTube.

If you spend $3000 of that on video production and the CPM is $4 (hard to find solid numbers, but this post talks about rates in that range), the remaining $2000 could put that video in front of half a million people (1000 views/$4 * $2000). That’s a big number, certainly more than attend any conference.
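The arithmetic is easy to play with if you want to try other numbers. A small Python sketch using the figures from this post; the function and variable names are mine, and it treats one view as one person, which is generous.

```python
def views_for_budget(ad_budget_usd: float, cpm_usd: float) -> int:
    """Number of views a budget buys when cpm_usd is the cost per 1,000 views."""
    return int(ad_budget_usd / cpm_usd * 1000)


travel_budget = 5000      # cost of sending a speaker, from the post
video_production = 3000   # production spend, from the post
cpm = 4                   # dollars per 1,000 views, from the post

# Whatever is left after production goes to promotion.
ad_budget = travel_budget - video_production
print(views_for_budget(ad_budget, cpm))  # 500000
```

Doubling the CPM to $8 halves the reach to 250,000 views, which is still far more than a conference audience.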

As someone who is gingerly stepping back into work conference travel, who doesn’t like to spend time away from his home, who is a member of the flyless community, and who came into devrel in earnest during the pandemic, I’m sympathetic to that viewpoint. It is more efficient to broadcast your message wholesale, whether that is with a blog post, webinar, or a video, than it is to talk to people retail at a conference booth, or even to give a talk to a hundred people.

But what I’ve learned is that there are real benefits to in-person conferences too: attention, prestige and feedback.

Attention

Think back to the last video you watched, especially if it was technical. How much of your attention did you give it? Perhaps 100% if following a tutorial. But perhaps substantially less if it was background noise or you were looking to learn a bit on the subject.

I’ve definitely “attended” online conferences where I was not paying attention. And I have never popped into a virtual conference “booth”, so I have no idea if the content there is compelling.

I’ve also seen folks at in-person talks on their phones or computers, to be sure, but it is not the rule.

Data is hard to come by, but I believe that folks are more likely to pay attention at an in-person event. They have made more effort, so they are more committed (research finds “working hard can also make [things] more valuable”). There’s also more distance from normal work tasks during an in-person conference. Attendees have far fewer distractions, and an expectation of attention. I think that focusing your attention on a speaker at a talk you are attending is the polite thing to do as well, and there are social norms pressuring folks to do that.

This attention makes an attendee at an in-person conference more valuable than a YouTube viewer.

Prestige

While not everyone can create a great video, anyone with a camera can make one.

On the other hand, not everyone can buy a booth at a conference, attend one, or speak. There is a filter on everyone who is at an in-person conference. This filter disadvantages folks who can’t travel, have a disability, or have other constraints. But it improves the value of an interaction at a conference too.

Being able to pay for a booth or have a talk accepted in particular are signals of quality. They don’t equate with quality, as anyone who has sat through a vendorware conference presentation can attest, but there is some level of prestige that accrues to an organization by being at a conference. That’s one of the reasons companies pay to sponsor conferences; there’s value in being seen there. (Others might phrase it differently.)

Feedback

Feedback is the last and, in my mind, most valuable differentiator between in-person conferences and online educational activities.

At a conference, the opportunity for two way communication abounds!

Any time someone stops by a booth or asks a question after a talk, as an educator you have the opportunity to not just answer a question or address a comment, but to dig in and understand the person’s context. What do they do? Why are they asking that particular question? Is there an unstated assumption in their question?

You can and often do have ten minute conversations at a booth, and this qualitative, high bandwidth feedback from expensive software development professionals is valuable in learning about your market and seeing if your message resonates.

Contrast that with the limited q&a at an online conference or the comments on a video. Yes, that is also feedback, but it is far less nuanced, considered, and interactive.


The ready availability of high quality, intense feedback driven by back and forth communication is the killer feature of in-person conferences. I don’t see any way to replicate that right now online.