GitHub Copilot Training on Private Repositories

27 Feb 2024

GitHub Copilot is an AI-powered code completion tool that helps developers write code faster and with fewer errors. It uses machine learning to suggest whole lines or blocks of code as you type, based on the context of your code and the comments you’ve written.

We have always been told that it was trained on a large corpus of public code, but today, when I asked it some questions in the chat, I got the following response:

GitHub Copilot Training

The full text of the response was:

I, GitHub Copilot, am a product of a collaboration between GitHub and OpenAI. I use the Codex model, a descendant of the GPT-3 and GPT-4 models, which was trained on a diverse range of public internet text. However, GitHub’s public and private repositories were a key part of the training set.

I’m designed to understand context in a codebase, and I can suggest whole lines or blocks of code to help developers write code faster. I work directly in the Visual Studio Code editor, providing assistance right where developers need it.

Please note that while I strive to provide helpful responses, my suggestions should be carefully reviewed for accuracy and appropriateness for your specific project.

So did GitHub train Copilot on private repositories?

Structuring the Service Catalog in Spotify's Backstage

05 Mar 2023

Spotify’s Backstage is an open-source platform that aims to streamline the software development process. One of its key features is the service catalog, a central place for managing all your services. In this blog post, we’ll explore how to structure the service catalog in Backstage.

Understanding the Service Catalog

The service catalog in Backstage is a comprehensive list of all the software components, resources, and services that your team or organization uses. It provides a single source of truth about the software you own, the status of the services, and who’s responsible for them.

How to Structure the Service Catalog

Define Your Services

The first step in structuring your service catalog is to define your services. A service in Backstage can be anything: a microservice, a library, a data pipeline, or even a team. Each service should have a clear purpose and ownership.

When defining your services, consider the following:

Purpose

Clearly define the purpose of each service. What problem does it solve? What functionality does it provide? Understanding the purpose of a service helps teams and stakeholders determine its value and relevance within the organization. It also aids in making informed decisions about service dependencies, resource allocation, and future enhancements.

Some examples of service types:

Microservice: A small, independent service that focuses on a specific business capability.
Library: A reusable collection of code or functions that can be used by other services or applications.
Data Pipeline: A system for processing and transforming data from one source to another.
Frontend Application: A user-facing application that interacts with users and displays information.
Backend Application: A server-side application that handles business logic and data processing.
Infrastructure Service: A service that provides infrastructure resources, such as databases, storage, or networking.

These are just a few examples, and the specific types of services can vary depending on your organization and the nature of your software projects.

Ownership

Assign ownership to each service. This helps establish accountability and ensures that someone is responsible for maintaining and improving the service.

To ease management of ownership, Backstage supports using CODEOWNERS files to define service ownership. This allows you to specify who owns a service directly in the code repository, making it easy to keep track of service ownership and changes over time.

This means that ownership is defined in a single place, which helps keep it maintained and up to date.

Catalog file

The catalog file is a YAML file that defines the services in your Backstage catalog. It’s a central place where you can define your services, their metadata, and their relationships. The catalog file is used to generate the service catalog in Backstage and is stored in the project’s repository.

An example catalog-info.yaml file might look like this:

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: my-service
  description: My service
spec:
  type: service
  lifecycle: production
  owner: my-team
  providesApis:
    - my-api
  dependsOn:
    - my-dependency
  system: my-system
---
apiVersion: backstage.io/v1alpha1
kind: System
metadata:
  name: my-system
  description: My system
spec:
  owner: my-team
---
apiVersion: backstage.io/v1alpha1
kind: API
metadata:
  name: my-api
  description: My API
spec:
  # API entities also require type, lifecycle and definition; omitted here for brevity
  owner: my-team
  system: my-system

This file defines a service called my-service that provides an API called my-api and depends on a library called my-dependency. It also defines a system called my-system that the service belongs to.
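To make the relationships in that file easier to see, here is a minimal Python sketch that mirrors the three entities as plain dictionaries and lists any reference that does not resolve to an entity in the same file. This is not Backstage's own API or validation logic; the function name `unresolved_references` and the flattened dictionary layout are assumptions made purely for illustration.

```python
# Entities mirroring the example catalog-info.yaml, flattened into plain dicts.
# This layout is an illustrative assumption, not Backstage's data model.
entities = [
    {
        "kind": "Component",
        "name": "my-service",
        "system": "my-system",
        "providesApis": ["my-api"],
        "dependsOn": ["my-dependency"],
    },
    {"kind": "System", "name": "my-system"},
    {"kind": "API", "name": "my-api", "system": "my-system"},
]


def unresolved_references(entities):
    """Return (entity name, reference) pairs that point at nothing in this list."""
    names = {entity["name"] for entity in entities}
    missing = []
    for entity in entities:
        refs = [entity["system"]] if entity.get("system") else []
        refs += entity.get("providesApis", []) + entity.get("dependsOn", [])
        for ref in refs:
            if ref not in names:
                missing.append((entity["name"], ref))
    return missing


print(unresolved_references(entities))  # [('my-service', 'my-dependency')]
```

The only unresolved reference is my-dependency, which the example file mentions but does not define. In a real catalog it would presumably be registered from another repository's catalog file.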

The Backstage catalog system can model complex relationships between services, systems, and APIs, allowing you to define a rich and detailed service catalog that accurately reflects your software architecture. This is shown in their diagram below.

Backstage catalog system

Conclusion

Structuring your service catalog in Backstage can help you manage your services more effectively. By defining your services, using descriptors, organizing services by teams, using labels, and keeping your catalog up-to-date, you can create a service catalog that truly serves as a single source of truth for your software.

Why Use Spotify's Backstage for Your Software Development?

25 Feb 2023

If you’re a developer or part of a development team, you might have heard about Spotify’s Backstage. It’s an open-source platform that aims to streamline the software development process. But why should you consider using it? Here are some compelling reasons:

1. Service Catalog

One of the key features of Backstage is its service catalog. This catalog allows you to create, manage, and find all your services in one place. It’s like having a neatly organized library of all your software components.

2. Standardization

Backstage provides a standardized way of building and managing software. This standardization can increase productivity by reducing the time spent on learning how to use different tools. It also ensures consistency across different projects.

3. Improved Developer Experience

Backstage is designed with the developer experience in mind. It provides a range of tools and services in a single, easy-to-use interface. This can significantly improve the efficiency and enjoyment of the development process.

4. Extensibility

One of the great things about Backstage is its extensibility. You can add custom plugins or tools that your team needs. This means you can tailor Backstage to perfectly suit your team’s workflow.

5. Open Source

Being open-source, Backstage is not only free to use, but you can also modify it to suit your specific needs. Plus, you can benefit from the contributions made by the global developer community.

In conclusion, Spotify’s Backstage offers a range of features designed to make software development more efficient and enjoyable. Whether you’re part of a small team or a large organization, it’s definitely worth considering as a tool to enhance your software development process.

HSBC Poor Security Policies

23 Mar 2015

For several years HSBC has provided a key fob to log in and authorise transactions on its website.

Recently they upgraded their mobile applications so that they can generate secure codes, removing the need for a separate device that would probably get lost.

During signup it asks a few questions and for a new password. The text states that passwords must be over 6 characters, so for security I used LastPass to generate a 30-character password.

This was accepted; however, only 8 characters were shown on the screen. After double-checking, it turns out that the application silently ignored the other 22 characters and set my password to an 8-character password without warning.

I feel this is especially dangerous for the following reasons:

If I hadn't paid attention, I wouldn't have noticed.
If you follow the XKCD-recommended password system of 4 words joined together, your password will be very insecure, because only the first 8 characters are kept.
Who thinks 8 characters is acceptable?
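To put rough numbers on this, here is a minimal Python sketch of the behaviour I observed. It is not HSBC's code. The function names are hypothetical, and the 62-character alphabet (letters and digits) is an assumption about the generated password rather than something I verified.

```python
import math

MAX_LEN = 8  # the limit observed in the app


def truncate_silently(password: str, max_len: int = MAX_LEN) -> str:
    """Keep only the first max_len characters, with no warning to the user."""
    return password[:max_len]


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a uniformly random password: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


# An XKCD-style passphrase of four joined words loses almost everything.
print(truncate_silently("correcthorsebatterystaple"))  # correcth

# Assuming a 62-character alphabet of letters and digits:
print(round(entropy_bits(30, 62)))  # 179 bits for the password I generated
print(round(entropy_bits(8, 62)))   # 48 bits for what was actually stored
```

Under those assumptions, a randomly generated password drops from about 179 bits of entropy to about 48, and a four-word passphrase is reduced to its first word plus one letter.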

So I asked HSBC Help UK on Twitter.

@HSBC_UK_Help why are passwords for digital secure key limited to 8 characters? Not very secure 4:03 PM - Mar 22, 2015

@addersuk Hi Adam. It is a business decision, as we believe it’s long enough to be secure but short enough to be remembered.^JB — HSBC UK Help (@HSBC_UK_Help) March 22, 2015

@HSBC_UK_Help so why does your app let me enter a longer password and then truncate the password

@addersuk I am sorry if this has caused you any inconvenience Adam. Have you managed to set up a password now?^JB — HSBC UK Help (@HSBC_UK_Help) March 22, 2015

I feel this raises security concerns about HSBC if they are willing to have poor security on their systems.

Alternatives to Microsoft SQL Server

10 May 2014

Microsoft SQL Server licensing is expensive; however, you do get the benefit of very good development tools and integration. There are three editions:

Web – no performance tools, SSIS or replication
Standard
Enterprise

Alternatives to Microsoft SQL Server are:

MySQL – Currently owned by Oracle and slowly being moved away from open source. It's a quick database; however, it misses a lot of SQL Server's features and is very slow at stored procedures and views, which your current system uses extensively. MySQL simplifies database development, and this is why it is the most popular database system. There are a number of compatible databases, including MariaDB and Percona Server.
Postgres – Fantastic open source database that has excellent performance and features. Originally based on the Ingres database system (which also sits in SQL Server's ancestry, via Sybase), it is under constant development, and they have recently added a number of features to compete against NoSQL databases like MongoDB.
Ingres – Sadly neglected by its owners over the past 10-20 years.
Other commercial databases – IBM DB2, Oracle, etc., all very expensive.

If I had to choose a database system for a new system, I would go for Postgres; however, there are a number of risks with Postgres when migrating an existing project:

The current system will need changing to work. The Postgres driver and database might not have the same abilities and features as those provided by the Microsoft drivers.
Migrating to a different database will be difficult and will possibly need training and support for current staff. You would need significant downtime to migrate the data from one system to the other.
It is less used in industry, so experienced staff are not easily available.
Database tooling is not fully integrated into development environments like Visual Studio.
There is less documentation, blog coverage and advice available due to lower usage.

Unfortunately there is no simple replacement and although you might save money on licensing fees, you may end up spending the savings elsewhere.

However, if you're starting a new project, I would use it.


Adam Leach