Knot Invariants Research Using Supervised Learning

As a junior at BYU, I was hired by a research group in the mathematics department headed by Dr. Mark Hughes, who specializes in low-dimensional topology, including knot theory. Knot theory involves more data than most areas of pure mathematics research—a theoretically infinite number of knots, thousands of which have been characterized and catalogued, each described by dozens of mathematical attributes (“invariants”)—so he created the research group to explore data-oriented approaches to the study of knots.

The question that came to occupy most of my time was this: could machine learning uncover unexpected interactions among knot features? In other words, could it identify large-scale patterns in knot invariants that, according to known theory, should not be there? If so, it would hint at latent mathematical structure that topologists could then investigate.

I used KnotInfo1, a database of all 12,965 prime knots with crossing number up to 13 and all their known invariants.2 (A prime knot is one that isn’t made by connecting two smaller prime knots; the crossing number is the minimum number of crossings in any diagram of the knot). Most invariants are known for lower-crossing knots, but as the number of crossings grows, so does the difficulty of calculating invariants; thus, not every knot in the database has a value for every invariant. But there’s enough to work with. I also used Matlab code Dr. Hughes had previously written to pseudo-randomly generate knots of arbitrary crossing number, so our dataset could include knots larger than crossing number 13.

All invariants included (if known) for each knot in the KnotInfo database

Here are the details for the approach that proved successful (meaning that among a lot of predictable, boring results I found a single unpredictable, interesting one). I set out to see if I could accurately predict any of a knot’s invariants using only its Jones polynomial.

Data Validation

First I filtered out knots for which the Jones polynomial was not definitively known—in KnotInfo, where the lower bound did not equal the upper bound, or in the generated dataset, where the Jones calculation returned “failed.”

Encoding the Features

Example Jones polynomial: V(t)=t3+3t23t1+33t+2t2t3V(t) = -t^{-3}+3t^{-2}-3t^{-1}+3-3t+2t^2-t^3

Since Jones polynomials are strings, I had to parse them into a format machine learning models could accept. I wrote a custom parser to turn them into lists where the index corresponds to the exponent, in the order 0, 1, -1, 2, -2, …, and the value is the coefficient. The example above would have become [3, -3, -3, 2, 3, -1, -1]. Each list index becomes its own feature column.

Different knots have different Jones polynomial lengths, so I padded the end of each list, appending zeros until each was as long as the longest list in the dataset. Generally, filling missing values with zeros would be bad practice, but in this case it is an appropriate solution; each list element represents the coefficient of one term in a polynomial, so a list element being nonexistent is mathematically equivalent to it being zero (e.g., x^2 = x^2 + 0x + 0). This made the inputs consistent and ready to input into a neural net.

Model Setup

Since I wanted to predict as many invariants as possible using only the Jones polynomial, I made it so I could pass in the name of an invariant and a model would train using Jones as the input and the named invariant as the target. This was a baseline framework I would have expanded had I stayed with the research group longer; every invariant, some with various possible encodings, could have been either an input or an output of this generalized model pipeline, and I then I could have compared every 1:1 input:output combination, or which combinations of invariants were more predictive than any one alone.

I first used Jones to predict all the integer invariants, since those were the easiest to encode; you can either treat them as numbers or treat them as categories using a label encoder, and both are simple to set up. My model architecture was a simple neural network using the Keras API over TensorFlow, trained independently for each target invariant. These were my prediction accuracies on each test set:

From here, my ability to contribute was nil; I had no training in topology, so I didn’t know what any of these invariants were or whether evidence of their connection to the Jones polynomial was significant. I emailed the results to Dr. Hughes’s, who said the connection between the Jones polynomial and the Arf invariant was well known, but the result for Genus-4D was a little more suprising. Additionally, he later discovered a bug in his Matlab code that was causing it to produce invalid knots, and once it was fixed, the accuracy of 4D genus predictions jumped even higher, to 97.7%.

Unfortunately, I left the research group soon afterward, but Dr. Hughes told me it continued to be a significant research question for him for some time. A group of physicists in South Africa he worked with expressed interest in the connection with the 4D genus, and he sent my results on to them for further testing. Meanwhile, I got a new job, and I lost the trail of the research. But it was fun while it lasted!

Code

import sys
import numpy as np
import os
import pandas as pd
import math
import datetime
import random
from sklearn.model_selection import train_test_split
import tensorflow as tf

pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('max_colwidth', None)
generated_knots = pd.read_csv('Train 1.txt')
knots_for_training = pd.read_csv('Database Knots.csv')    # Knots from database
og_knots_for_predicting = pd.read_csv('Predict 1.txt')    # For making predictions
def GetOneJonesList(generatedJones):
    temp = generatedJones.replace('[','').split('],')
    temp[-1] = temp[-1].replace(']','')
    for i in range(len(temp)):
        temp[i] = temp[i].split(',')
        temp[i][0], temp[i][1] = int(temp[i][0]), int(temp[i][1])
    
    newJones = []
    n = 0
    while n <= max( abs(temp[0][0]), temp[-1][0] ):
        zero = True
        for i in temp:
            if i[0] == n:
                newJones.append(i[1])
                zero = False
        if zero:
            newJones.append(0)
        if n > 0:
            n *= -1
        else:
            n -= 1
            n *= -1
    return newJones

def GetInvariantLists(generated_knots, other_invariant):
    if other_invariant == 'Genus-4D':
        other_invariant = 'Slice genus lower bound'
    
    cleaned_dataframe = generated_knots[~generated_knots['Jones Polynomial'].str.contains('Failed')]
    cleaned_dataframe = cleaned_dataframe[generated_knots['Slice genus lower bound'] == generated_knots['Slice genus upper bound']]
    
    jonesLists = list(cleaned_dataframe['Jones Polynomial'])
    otherList = list(cleaned_dataframe[other_invariant])
    
    newJonesLists = []
    for i in jonesLists:
        newJonesLists.append(GetOneJonesList(i))
    return newJonesLists, otherList

def GetNumber(jones, j):
    number = ''
    while j < len(jones) and jones[j].isdigit():
        number += str(jones[j])
        j += 1
    return int(number), j

def GetExponent(jones, j):
    if j >= len(jones) or jones[j] != '^':
        return 1
    else:
        exponent = 0
        if jones[j+1].isdigit():
            exponent, useless = GetNumber(jones, j+1)
        elif jones[j+1] == '(':
            positiveExponent, useless = GetNumber(jones, j+3)
            exponent = positiveExponent * -1
        return exponent

def ProcessSegment(jones, i):
    position = 0
    if jones[i+1] == 't':
        if i == -1:
            value = 1
        elif jones[i] == '+':
            value = 1
        else:
            value = -1
        
        if i+2 >= len(jones):
            position = 1
        else:
            position = GetExponent(jones, i+2)
    elif jones[i+1].isdigit():
        value, nextIndex = GetNumber(jones, i+1)
        if jones[i] == '-':
            value = value * -1
        if nextIndex >= len(jones) or jones[nextIndex] == '+' or jones[nextIndex] == '-':
            position = 0
        elif jones[nextIndex] == '*':
            position = GetExponent(jones, nextIndex + 2)
    return position, value

def GetJonesDictionary(jones):
    jonesDictionary = {}
    if (jones[0] == 't') or jones[0].isdigit():
        position, value = ProcessSegment(jones, -1)
        jonesDictionary[position] = value
    for i in range(len(jones)):
        if (jones[i]=='+' or (jones[i]=='-' and jones[i-1]!='(')):
            position, value = ProcessSegment(jones, i)
            jonesDictionary[position] = value
    return jonesDictionary

def GetJonesList(jones):
    jonesDictionary = GetJonesDictionary(jones)
    powers = list(jonesDictionary.keys())
    radius = max(abs(min(powers)), abs(max(powers)))
    
    jonesList = []
    jonesList.append(jonesDictionary.get(0, 0))
    
    for i in range(1, radius+1):
        jonesList.append(jonesDictionary.get(i, 0))
        jonesList.append(jonesDictionary.get(-i, 0))
    return jonesList

def ReformatJonesPolynomials(knots):
    multiJonesList = []
    generated = 'Jones Polynomial' in knots.columns
    
    if not generated:
        for i in knots.index:
            jones = knots['Jones'][i]
            multiJonesList.append(GetJonesList(jones))
    else:
        multiJonesList, useless = GetInvariantLists(knots, 'Genus-4D')
    return multiJonesList
def GetIndicesToRemove(knots, columnTitle):
    indices = []
    invariants = list(knots[columnTitle])
    for i, inv in enumerate(invariants):
        if isinstance(inv, (int, float)):
            if math.isnan(inv): indices.append(i)
        elif isinstance(inv, str):
            if not inv.isdigit(): indices.append(i)
    return indices

def ShuffleLists(joneses, other):
    c = list(zip(joneses, other))
    random.shuffle(c)
    return zip(*c)

def FilterTerms(joneses, jones_terms):
    longestJones = len(max(joneses, key=len))
    newJoneses = []
    for jones in joneses:
        newJones, n = [], 0
        for index in range(longestJones):
            val = jones[index] if index < len(jones) else 0
            if n in jones_terms:
                newJones.append(val)
            n = (n - 1) * -1 if n <= 0 else n * -1
        newJoneses.append(newJones) 
    return newJoneses

def ListPrep(generated_knots, Indiana_database_knots, nonJonesInvariant, jones_terms, shuffle, test_split):
    badIndices = GetIndicesToRemove(Indiana_database_knots, nonJonesInvariant)
    joneses = ReformatJonesPolynomials(Indiana_database_knots)
    other = list(Indiana_database_knots[nonJonesInvariant])
    
    for i in reversed(badIndices):
        joneses.pop(i)
        other.pop(i)
    other = list(map(int, other))

    if not generated_knots.empty:
        gen_jones, gen_other = GetInvariantLists(generated_knots, nonJonesInvariant)
        joneses.extend(gen_jones)
        other.extend(gen_other)

    if shuffle:
        joneses, other = ShuffleLists(joneses, other)
    if jones_terms:
        joneses = FilterTerms(joneses, jones_terms)
    
    if test_split:
        return train_test_split(joneses, other, test_size=test_split)
    return joneses, other

def PadTensors(t1, t2):
    s1, s2 = t1.shape[1], t2.shape[1]
    if s1 > s2:
        t2 = tf.pad(t2, [[0,0], [0, s1-s2]], 'CONSTANT')
    elif s1 < s2:
        t1 = tf.pad(t1, [[0,0], [0, s2-s1]], 'CONSTANT')
    return t1, t2

def GetInputTensors(generated_knots, Indiana_database_knots, nonJonesInvariant, jones_terms=[], shuffle=True, one_hot=True, test_split=0.2):
    if test_split:
        jTrain, jTest, oTrain, oTest = ListPrep(generated_knots, Indiana_database_knots, nonJonesInvariant, jones_terms, shuffle, test_split)
        jTrain = tf.ragged.constant(jTrain).to_tensor()
        jTest = tf.ragged.constant(jTest).to_tensor()
        jTrain, jTest = PadTensors(jTrain, jTest)
        
        if one_hot:
            numOutputs = int(tf.math.reduce_max(oTrain) + 1)
            oTrain = tf.one_hot(oTrain, depth=numOutputs)
            oTest = tf.one_hot(oTest, depth=numOutputs)
        return jTrain, jTest, oTrain, oTest
    else:
        jones, other = ListPrep(generated_knots, Indiana_database_knots, nonJonesInvariant, jones_terms, shuffle, test_split)
        jones = tf.ragged.constant(jones).to_tensor()
        if one_hot:
            numOutputs = int(tf.math.reduce_max(other) + 1)
            other = tf.one_hot(other, depth=numOutputs)
        return jones, other

def GetPredictionTensors(to_predict, jonesTrain, jones_terms=[]):
    to_predict = ReformatJonesPolynomials(to_predict)
    if jones_terms:
        to_predict = FilterTerms(to_predict, jones_terms)
    else:
        to_predict = tf.ragged.constant(to_predict).to_tensor()
        to_predict, _ = PadTensors(to_predict, jonesTrain)
    return to_predict
def TrainAndEvaluateModel(generated_knots, knots_for_training, knots_for_predicting, invariant_name, jones_terms, test_split, optimizer_function, loss_function, metrics, epochs, batch_size, verbose, predict):
    tensors = GetInputTensors(generated_knots, knots_for_training, invariant_name, jones_terms, True, True, test_split)
    
    if test_split:
        jTrain, jTest, oTrain, oTest = tensors
    else:
        jTrain, oTrain = tensors

    if predict:
        knots_for_predicting = GetPredictionTensors(knots_for_predicting, jTrain, jones_terms)

    input_shape = jTrain.shape[1]
    output_shape = oTrain.shape[1]

    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(input_shape,)),
        tf.keras.layers.Dense(30, activation="softplus"),
        tf.keras.layers.Dense(30, activation="softsign"),
        tf.keras.layers.Dense(20, activation="tanh"),
        tf.keras.layers.Dense(10, activation="selu"),
        tf.keras.layers.Dense(10, activation="elu"),
        tf.keras.layers.Dense(output_shape, activation="softmax")
    ])
    
    model.compile(optimizer=optimizer_function, loss=loss_function, metrics=[metrics])
    model.fit(jTrain, oTrain, epochs=epochs, batch_size=batch_size, verbose=verbose)
    
    if test_split:
        model.evaluate(jTest, oTest)

    return model, knots_for_predicting
  1. C. Livingston and A. H. Moore, KnotInfo: Table of Knot Invariants, knotinfo.org, 1 Jan 2022. ↩︎
  2. I’m not going to keep track of updates to the database, so it may have changed since I was working on this project. This is just a description of what the database included at the time. ↩︎