Leveraging AI to Design the Next Generation of Genome Editing Proteins: My Journey from Data to Design

I began working in biology not in a lab but with large, complex datasets. As a computational biologist working with next-generation sequencing data, including single-cell RNA sequencing, my central challenge was understanding how cells behave. I learned how gene expression changes across tissues, how small populations of unusual cells can drive disease, and why a cell's context matters far more than population averages. For me, biology is about understanding cells, and biological data is the material I work with. Over time, though, one question kept bothering me: if we understand these systems so well, why are the tools we use to edit genomes still so basic?

That question led me to genome editing and, eventually, to artificial intelligence.

CRISPR systems, which originally evolved as RNA-guided immune defenses in microbes, became programmable genome editing tools through guide RNAs that direct nucleases such as Cas9 and Cas12 to specific DNA targets. While this programmability transformed biology, these enzymes did not evolve for therapeutic precision, cell-type specificity, or delivery constraints: the exact challenges that define my goal of building safer and more efficient genome editing systems.

Working with high-resolution transcriptomic and single-cell data shaped how I see this limitation. Editing outcomes vary across cell states, tissues, and disease contexts, yet most genome editors are treated as fixed tools. My goal therefore shifts from simply selecting targets to designing editors that account for biological variability, specificity requirements, and real-world delivery constraints.

Artificial intelligence enables this transition from analysis to design. By learning sequence-function relationships across large CRISPR-Cas datasets, AI makes it possible to engineer compact, high-fidelity genome editors tailored to defined biological tasks. In this way, my objective is not only to understand cellular systems through data, but to use that understanding to design next-generation genome editing proteins that are context-aware, therapeutically viable, and intentionally engineered rather than naturally inherited.

Evolution explores sequence space slowly; AI can explore it intentionally.

I stopped asking which mutation I should try next. Now I ask what a given editing task requires, and whether a computational model can design a protein that meets those requirements.

Deep learning models made it possible to co-optimize activity, specificity, PAM recognition, and even compatibility with delivery constraints. Guide RNA design used to rely on educated guesses; now it can be approached as a data-driven problem.
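To make the idea of co-optimization concrete, here is a minimal sketch of one common way to compare candidates on several objectives at once: keeping only the Pareto-optimal ones. It is an illustration, not the method used in any specific platform. The `Candidate` fields, the variant names, and all scores are hypothetical, and PAM recognition is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    activity: float     # hypothetical predicted score, higher is better
    specificity: float  # hypothetical predicted score, higher is better
    length_aa: int      # protein length in amino acids, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    at_least_as_good = (
        a.activity >= b.activity
        and a.specificity >= b.specificity
        and a.length_aa <= b.length_aa
    )
    strictly_better = (
        a.activity > b.activity
        or a.specificity > b.specificity
        or a.length_aa < b.length_aa
    )
    return at_least_as_good and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Made-up example values, for illustration only.
candidates = [
    Candidate("variant_A", 0.90, 0.60, 1368),
    Candidate("variant_B", 0.80, 0.85, 1100),
    Candidate("variant_C", 0.70, 0.80, 1200),  # worse than variant_B on all three
    Candidate("variant_D", 0.60, 0.95, 1000),
]
front = pareto_front(candidates)
front_names = [c.name for c in front]
print(front_names)
```

A Pareto front does not pick a single winner. It only removes candidates that are worse on every axis, which leaves the trade-off between activity, specificity, and size as an explicit decision.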

Structure prediction is especially helpful because it gives me another reason to trust the results. When I combine AI-based sequence design with tools like AlphaFold, I can move quickly from a generated sequence to a predicted structure. This works even for proteins that do not exist in nature. Structure prediction helps me answer one question early: is this protein likely to fold into a physically plausible structure? Being able to check this up front saves me months of waiting on experiments to find out.

What excites me most is not one particular model or tool but the whole closed-loop workflow that is coming together. AI designs the proteins and their guides. High-throughput experiments then test whether the designs hold up. The results of those experiments go back into the AI model, and with each cycle the system gets smarter. This is not just automation making things easier; it is a new way of doing biological engineering.
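The loop described above can be sketched in a few lines. This is a toy illustration under loud assumptions: the four-letter alphabet, the hidden `TARGET`, the `simulated_assay` function, and the per-position averaging "model" are all hypothetical stand-ins, not a real protein model or a real experiment. The only point is the shape of the cycle: fit a model, propose designs, measure them, feed the results back.

```python
import random

ALPHABET = "ACDE"    # toy alphabet, not the real amino-acid space
TARGET = "ACDEACDE"  # hidden optimum, known only to the simulated assay


def simulated_assay(seq: str) -> float:
    """Stand-in for an experiment: fraction of positions that match TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET)) / len(TARGET)


def fit_model(measured: dict[str, float]) -> dict[tuple[int, str], float]:
    """Toy model: mean measured score for each (position, residue) pair."""
    sums: dict[tuple[int, str], float] = {}
    counts: dict[tuple[int, str], int] = {}
    for seq, score in measured.items():
        for pos, res in enumerate(seq):
            sums[(pos, res)] = sums.get((pos, res), 0.0) + score
            counts[(pos, res)] = counts.get((pos, res), 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(model, seq: str, default: float = 0.5) -> float:
    """Average the learned per-position scores; unseen pairs get a neutral default."""
    return sum(model.get((p, r), default) for p, r in enumerate(seq)) / len(seq)


def propose(model, measured, rng, n_candidates: int = 200, batch: int = 8):
    """Design step: mutate the best measured sequences, keep the top model-ranked ones."""
    parents = sorted(measured, key=measured.get, reverse=True)[:5]
    pool = set()
    for _ in range(n_candidates):
        seq = list(rng.choice(parents))
        seq[rng.randrange(len(seq))] = rng.choice(ALPHABET)
        candidate = "".join(seq)
        if candidate not in measured:
            pool.add(candidate)
    # Break ties on the sequence itself so the ranking is deterministic.
    return sorted(pool, key=lambda s: (-predict(model, s), s))[:batch]


def run_loop(rounds: int = 5, seed: int = 0) -> list[float]:
    rng = random.Random(seed)
    start = ["".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(8)]
    measured = {s: simulated_assay(s) for s in start}
    history = [max(measured.values())]
    for _ in range(rounds):
        model = fit_model(measured)                 # learn from all data so far
        for candidate in propose(model, measured, rng):
            measured[candidate] = simulated_assay(candidate)  # "run the experiment"
        history.append(max(measured.values()))      # best score seen so far
    return history


history = run_loop()
print(history)
```

Because `history` tracks the best score measured so far, it can never decrease from one round to the next. How quickly it rises depends on how well the model ranks proposals, which is where a real sequence-function model would do the heavy lifting.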

At Cellogen Therapeutics, we explored this direction through the development of ‘CasAInova’, a deep learning platform trained on large sequence datasets and aimed at designing compact Cas9 and Cas12 variants. The goal is to generate smaller, high-fidelity editors compatible with delivery systems such as adeno-associated virus (AAV) vectors and lipid nanoparticles (LNPs) while maintaining activity and specificity.
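A quick back-of-the-envelope check shows why size matters for AAV delivery. The sketch below assumes the commonly cited AAV packaging limit of roughly 4.7 kb and uses the published lengths of SpCas9 (1368 amino acids) and SaCas9 (1053 amino acids). The 1000 bp budget for everything other than the coding sequence is a hypothetical round number chosen for illustration; real constructs vary.

```python
AAV_CAPACITY_BP = 4700  # approximate, commonly cited packaging limit including ITRs


def coding_length_bp(length_aa: int) -> int:
    """Coding sequence length: 3 bp per amino acid plus one stop codon."""
    return 3 * length_aa + 3


def fits_in_aav(length_aa: int, other_elements_bp: int,
                capacity_bp: int = AAV_CAPACITY_BP) -> bool:
    """True if the coding sequence plus all other elements fits the capacity."""
    return coding_length_bp(length_aa) + other_elements_bp <= capacity_bp


# Hypothetical budget for ITRs, promoter, polyA signal, and guide cassette.
OTHER_ELEMENTS_BP = 1000

spcas9_fits = fits_in_aav(1368, OTHER_ELEMENTS_BP)  # SpCas9
sacas9_fits = fits_in_aav(1053, OTHER_ELEMENTS_BP)  # SaCas9
print(spcas9_fits, sacas9_fits)
```

Under these assumptions the larger nuclease overshoots the limit while the smaller one leaves room to spare, which is the practical motivation for designing compact editors.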

“The central shift is conceptual: genome editors are no longer discovered, they are designed.”

I believe that future genome editing systems will be able to change and adapt, with proteins engineered to be safe and effective from the start. AI is not replacing biological insight; it is amplifying it, allowing us to design with intention rather than approximation.

For me, this journey from transcriptomic landscapes to AI-designed proteins feels less like a change in direction and more like a natural evolution. Biology has always been about patterns. AI simply gives us the language to read them and, now, to write new ones.

Genome editing

Kashif Husain

Junior Scientist (Manufacturing & QC)