Articles
The Story of Us — Wait but Why?
- An epic theory of human nature. Extensive notes here.
Knowable Magazine Annual Reviews: Building Bodies
- Very cool series on development. Notes here.
- Tangentially-related to CRISPR, but more a story about family and kids and how medicine and humanity can/are intersecting. Short and heartwarming read :)
The artist who co-authored a paper and expanded my professional network
- A Nature Careers column— awesome perspective (from both sides) on how fruitful it can be to have an artist-in-residence in your lab, no matter what you’re studying!
Research Readings
Titrating gene expression using libraries of systematically attenuated CRISPR guide RNAs
- Goal: Tool to study expression-phenotype relationship (in mammalian cells) with CRISPRi
1. Derived relationship between sgRNAs with diff mismatches & gene activity with deep learning
- Started with GFP reporter -> diff mismatches gave dynamic range of expression; mismatched -> generally unimodal distributions, but broader than the perfect gRNAs
- Derived rules of sgRNA activity by making library of guides targeting growth-related genes (each gene -> ‘allelic series’ of perfect match and ~20 variants with a few mismatches) -> CRISPRi; measured growth phenotype -> found relative activity of each guide
- Two mismatches usually -> no activity; one -> intermediate, depending location; other effects = weaker; most series had at least 2 guides with intermediate activity -> can titrate activity using these rules
- Also screened ~1000 variations on the constant region of gRNA by testing vs. ten essential genes; again derived & characterized rules of intermediate-activity variants ; however, more heterogeneity between series -> stuck with mismatches
- Used CNN to learn rules; trained 20 independent models (80% train, 20% x-val.
2. Constructed library of sgRNAs to control expression of ~2k essential genes
- Used ensemble to predict activities for all possible sgRNAs (57) for top five sgRNAs for ALL HUMAN GENES!! and hoped that using 3 predicted to have intermediate activity would yield at least 1, 95% of the time. Thus, have a library to titrate any gene.
- Made compact library for genes related to growth using first screen and CNN predictions, validated
3. “Stage” cells at different places on expression continuum; sc readout -> sharp transitions in behavior at certain expression thresholds
- Perturb-seq (pooled CRISPRi screen) -> matched capture of sgRNA & transcriptome; target 25 genes involved in essential processes with 5-6 guides each
- Assessed expression level of targeted genes: half of genes worked as expected with unimodal expression distributions, decreasing with better guides; others had bimodal distributions? Quantified as ‘fraction target UMI per cell;
- Some genes -> titration of overall transcriptional activity (total UMI counts); others genes -> specific & expected transcriptional responses
- Compared target gene expression vs. overall phenotype — bulk growth& tx response = well correlated; diff magnitudes & thresholds of activity required for diff genes
- Could follow sc trajectories for different levels of gene expression
Towards a comprehensive catalogue of validated and target-linked human enhancers
- We lack a complete catalogue of verified enhancers -> review new tech to discover, validate human enhancers at scale (reporter assays, CRISPR screens, biochemical measurements). Also propose framework to define enhancers, and accommodate heterogenous & diverse results, definitions of enhancers between various methodologies.
History of the enhancer concept
- Began 1981, with demonstration of non-coding DNA enhancing gene expression indep. of promoter & orientation of enhancing DNA, using episomal reporter vector; over next few years, ID’ed endogenous, mammalian, cell-type specific enhancer in IgH locus, then more
- Generalizations from examples -> genomic characteristics of enhancers: 1) Free of nucleosomes (DNAse I hypersensitivity), but flanked by some with specific histone modifications; 2) Contain clusters of TF binding sites, & TF binding underlies their activity; 3) Likely act by 3D looping to promoter.
- Initial strategies to locate enhancers = looking for non-coding conserved regions between mouse & human genomes (too many!), then lots of biochemical epigenetic assays: DNAse I hypersensitivity, bisulfite seq, TF immunoprec, histone mod — all genome-wide -> DNA-seq; RNA-seq revealed transcribed ‘eRNAs’. Heritability of many diseases attributed to ID’ed regions!
Proposed enhancer definition
- Validated & target-linked— distal element meeting three criteria:1) Deletion from native genomic context alters expression of potential target; 2) Evidence for cis-acting mechanism; 3) One orthogonal piece of evidence (reporter or biochemical assay) supporting enhancer-ness.
- Advantages over /operational/definitions based purely on reporter or biochemical assays, which offer incomplete evidence.
- New tech like sc seq of chromatin accessibility, higher res /chromosome confirmation capture (3C)/ methods, /massively-paralleled reporter assays (MPRAs)/, CRISPR screens -> can approach a definition closer to actual biological phenomenon.
Features identifying an enhancer
- In vivo functionality depends on 1) chromatin context and 2) trans-acting elements (TFs)
- /Classic model of action:/ recruit cell & condition specific TFs, then loop to interact with promoter. TFs can facilitate chromatin remodelling, recruit transcription machinery. However, lots of heterogeneity in possible mechanisms: can interact before activation, at other stages in mRNA synthesis, etc. Commonality = dependent on cell type & conditions, which determine TF binding.
- Other features used to characterize enhancers = sequence motifs, specific histone modifications, binding of TFs and other cofactors, 3D orientation. But none of these are defining: many exceptions, only moderate correlations with functional assays— again, heterogeneity!
- Ultimately defined by biological function: /“increasing the likelihood of transcription of one or more distally located genes through a cis-regulatory mechanism.”/
Current methods for scalable enhancer characterization
- Sequence analysis, biochemical annotations (ChIP-seq, DNAse-seq, ATAC-seq, etc.): limitations as touched on above; lots of false positives and don’t link with a target gene
- eQTL mapping (expression quantitative trait locus): correlate genome-wide genotypes in group of people (microarray or sequence genome) with transcriptome -> variants significantly assoc with gene expression diff = eQTLs. /Strengths:/ provide in vivo validation linking sequence to target; only way to test variants in actual humans. /Limitations:/ relies on natural variation, linkage disequilibrium (multiple variants in haplotype explain phenotype), relies on accessible tissue and cell types
Emerging methods
- 3D conformation mapping: (Hi-C, etc., pairing with biochem assays) /Strengths:/ Physical proximity of enhancer & target is required for activation! /Limitations:/ Disruptions to proximity don’t always disrupt expression; mobility also a factor
- Sc molecular profiling: (scATAC-seq et al., advances in microscopy) /Strengths/Allow correlations to be built from many cells rather than samples; very specific contexts & conditions. /Limitations:/ Still dep on biochem assays that require further fx validation
Tech for measuring enhancer activity
- MPRAs: Test position-indep activity within episomal vector; clone library of potential enhancers into vector and see if they activate gene with minimal promoter; abundance of gene’s barcode = readout. /Strengths:/ V high throughput; can test synthetic enhancers to enable unbiased screens & modelling. /Limitations:/ DNA synth length & cost constraints; interference from minimal promoter & differences from endogenous chromatin context (lots of false negatives)
- CRISPR screens of non-coding seq: Can directly measure effects of genetic or epigenetic perturbations on a certain gene’s expression. /Cas9 screens:/ Rules of disruption by indels less clear than in coding regions, limited by PAM sites, heterogenous edits to alleles, may only disrupt single binding motif & not reduce complete activity -> people try /long deletion/ approaches, but less efficient. /dCas9 screens:/ More consistent targeting of alleles; but synthetic interference & activation may not recapitulate actual activity -> false positives. /Whole-transcriptome screens:/ Overcome challenge of scalability by allowing all gene expression to be measured, not just one (phenotyping by scRNA-seq), but still expensive and limited to dCas9 bc toxicity of DSBs. /Future prospects:/ Figure out good way of validating ‘hits’ by deletion; determine rates of false-positives and negatives (esp bc many potential enhancers ID’ed by biochem assays don’t show effects by screens, and vice versa)
- Tech for in vivo validation: Mouse models— transgenic reporters (similar pros & cons to MPRAs), CRISPR knockouts. Limited to low throughout and elements conserved across mammals, but the best we can do.
Defining & cataloguing enhancers
- (Goal of ENCODE Consortium) Should move beyond a simple list of putative elements -> multimodal cell atlases of many cell types, validation both by MPRAs and CRISPR screens
- Include tiered framework to describe level of, heterogeneity in evidence & validation, based on criteria of “validated & target-linked” definition above
- Important to establish larger sets of true positives/negatives
- Remain open to evolving definitions— enhancers affecting splicing, whole-cell phenotype, etc.?
Maintaining Iron Homeostasis Is the Key Role of Lysosomal Acidity for Cell Proliferation
- Cells deprived of acidic lysosomes in culture eventually die; cancer cells dep on lyososome function… why? Found that key role is producing iron
- CRISPR/Cas9 screens for cells with altered pH (loss of what metabolic genes sensitize cells to lysosome pH inhibitors?) -> cholesterol synthesis and iron uptake = essential pathways; interesting bc used two different inhibitors, and top genes were diff (and not essential to other condition), suggesting diff metabolic responses to diff drugs, but same pathways; validated with knockouts of genes in these pathways
- cholesterol synth = necessary (addition of cholesterol rescued prolif when specific pathway was knocked out, but not when lysosome function was inhibited) but iron = necessary and sufficient for prolif when lysosomes were dysfunctional; trends of essentiality validated across RNAi databases
- Iron supplementation restores prolif independently of lysosome function or pH!
- Iron depletion from lysosome dysfunction -> altered HIF signaling & mitochondrial metabolism -> mechanisms of messing with cell prolif
Squalene accumulation in cholesterol auxotrophic lymphomas prevents oxidative cell death
- (Auxotrophic = can’t produce it by itself) Cells generally not auxotrophic for cholesterol, bc can synthesize de novo or uptake from outside, but certain cancer cells reliant on exogenous cholesterol
- Used competitive proliferation assay on pooled collection of DNA barcoded cell lines (28 cancer lines, lenti transduction with barcode) to find other auxotrophic lines; depleted media of lipoprotein -> couldn’t uptake cholesterol as well, affected prolif of a few lines
- Looked at database of transcriptomes of cancer lines -> one of the sensitive lines lacked SQLE mRNA mRNA and protein, blocking cholesterol synthesis pathway; then looked in database to find several more cell lines lacking SQLE (many ALK+ ALCLs)
- Negative selection CRISPR screen on different cancer lines targeting 200 metabolic genes to see what genes inhibit fitness of auxotrophic lines but not prototrophic ones; major hit = LDLR which allows for cholesterol uptake; validated in vitro and in vivo with CRISPR knockout & ab inhibition
- Next asked whether SQLE loss gave cancer cells some other advantage since loss of cholesterol synthesis doesn’t seem to be a good thing? Build up of certain metabolites like squalene? Knocked out gene upstream of SQLE so no squalene accum -> tumors grew slower than controls, didn’t do as well in competition assays; investigates mechanism -> may protect cells from lipid reactive oxygen and ferroptosis
- Big picture: use of systematic approaches (barcoding cells, CRISPR screens) to identify nutrient dependencies for specific cancers
- Integration of scRNA-seq data with longitudinal tx response in breast cancer cells into “mechanistic mathematical model of drug resistance dynamics” -> increase in predictive power vs either alone
- Despite attempts like RNA velocity, hard to get /dynamic/ info from omics datasets; likewise, optimisation of tx protocols by longitudinal studies of tumor response to drug = powerful, but doesn’t capture heterogeneity. Integration of two data types has been successful for looking at differentiation, but not yet for cancer (phenotypic transition not just change in cell state)
- Approach:
- Collect longitudinal, population level data on response to chemo (doxo)
- Collect snapshots of lineage-traces scRNA-seq data
- Build classifier to estimate proportion cells that are sensitive or resistant at diff time points
- Use estimations if cell number and phenotype to optimise model parameters
- Validated ability to predict responses to new tx protocols (used COLBERT barcoding system for lineage tracing)
- Phenotypic classification combined w transcriptome data -> mechanistic insights into hallmarks of resistance (enriched genes)
Mammalian gene expression variability is explained by underlying cell state
- Variability in gene expression within cell population can occur bc intrinsic factors (allele specific two-state transcriptional bursting, = noise, uncontrolled) & extrinsic factors (signals affecting many cells & genes, related to state). Difficult to deconvolute relative contributions of the two.
- Scale and sources of variability = important to interp single cell, multiplexed data
- /Approach:/ Utilized metrics of gene covariance and dispersion by measuring multiple cell state features (by TF activity, cell volume, Ca2+ stuff, etc.) and highly multiplexed gene expr (by MERFISH) in single cells respectively
- Covariance higher for genes in same pathway, so focused on network of Ca2+ response to ATP in epithelial cells (imp to wound healing)— nice bc fast timescale
- Multiple linear regression to decompose gene expression based on features measured— how much variance do they explain? Also added “hidden feature” based on first two PCs (accounts for bulk correlation not captured by chosen features?)
- Looked at dispersion in remaining (allele-specific variation) -> overall, transcriptional bursting played v little role in variability
ZipSeq : Barcoding for Real-time Mapping of Single Cell Transcriptomes
- Linking scRNA-seq to spatial dimensions, phenotypic profiling from microscopy -> “printed” DNA barcodes in spatially defined manner into cells, read out with seq: coated base oligo DNA onto cells in tissue by photocaging, could then control hybridisation with light
- Basically attach dsDNA to cells with something like an ab; the anchor strand has overhang blocked from base pairing, but is released with illumination -> sequentially add zipcode strands to hybridize to specific region
- Tested use of 2 zip codes by comparing “front” and “rear” cells from wound healing assay; outer and inner parts of lymph node (regions identified first with IF); profiled immune cells in marginal vs internal region of tumor
- Using 2 overhang strands and 2 zip codes -> 4 ROIs, got higher resolution, could detect gene gradients
- Nice bc can initially use microscopy methods to ID regions, and after barcoding, plugs into normal droplet-based scRNA-seq workflow!
CRISPR Interference-Based Platform for Multimodal Genetic Screens in Human iPSC-Derived Neurons
- Most CRISPR, CRISPRi screens done in cancer lines; here used neurons diff from iPSCs -> three screens: 1) survival-based, revealing essential genes and genes whose knockdown improved survival; 2) screen with scRNA-readout -> genes whose knockdown had v cell-type specific effects; 3) longitudinal imaging screen -> effects of knockdown on morphology
- First established line of i^3N iPSC (neurons) with dCas9-BFP-KRAB in AAVS1 locus; validated that CRISPRi didn’t cause toxicity (lenti transduction with sgRNA)
- Screened kinases & druggable target genes for iPSCs and iPSCs induced to become neurons; used new bioinformatics pipeline to anaylze to find hits (made full use of non-targeting controls to get FDR by comparing each set of sgRNAs targeting the same gene to 5 random controls). “Hit” genes usually had 1-2 significant sgRNAs, justifying depth of 5 sgRNAs/gene. Compared essential genes across cell types and to hallmark essential genes from cancer lines. GSEA -> neurons v dependent on cholesterol syntehsis pathway; validated by treating with drug; knocked down gene in pathway and showed supplying its product partially rescued phenotype
- Secondary screens with new lenti library to validate 182 hits & controls: CROP-seq (with sc readout) and high-throughput imaging (BFP nucleus)
- Repeat of survival screen validated primary results; also tested to see how presence of astrocytes with neurons changed dependence- only a few differences
- Used inducible CRISPRi to see whether the hit genes were necessary for neuron survival or differentiation (induced after neurons or as iPSCs); most for survival
- For CROP-seq, chose ~50 genes representing diff patterns & several controls; specifically looked at cells with high % knockdown (since essential genes, population enriched for low % knockdown). Knockdown of specific genes & functionally realted genes -> specific transcriptomic signatures! Ex. upregulation of genes in related pathways; showed that CROP-seq can identify functional clusters of genes. Also compared responses between iPSCs and neurons -> cell-type specific effects!
- Did arrayed screen with less genes tracking neurites and nucleus (for survival), will be more scalable if don’t need lenti sgRNA infection in the future; approach can be used to track more complex phenotypes- electrophysiological? Also can combine CRISPRa!
Capybara: A computational tool to measure cell identity and fate transitions
- Samantha Morris!
- Sparse scRNA-seq data -> difficult to use cell-type ID algorithms meant for bulk RNA-seq; sc tools (Garnett, SingleCellNet, etc) limited bc often require previous bio knowledge or classify into discrete groups
- Cabybara -> /unsupervised/ method to measure cell identity on /continuum/ at sc resolution; captures discrete identities but also cells with multiple identities, perhaps in transition -> metric for cell fate transition dynamics
- Benchmarked vs other classifiers; effective at IDing critical transitions in hematopoeisis, revealed unchar patterns in engineered/reprogrammed cells— and in vivo correlate!
- Approach: measure identity scores for each cell vs exhaustive public cell type collection using /quadratic programming/— “method previously used to evaluate cell progression between defined cell identities in direct lineage reprogramming” -> score identity as lin combo of all possible identities; can get both discrete class and multiple identities -> also define “transition metric”
- How it works:
- Tissue level classification: Use /annotated sc atlas/ (reference) and /sc expression profiles/ (sample) to do bulk-transcriptome-based quadratic programming (using ARCHS4 database) to get reference and sample tissue identity scores, revealing potential tissues involved. Helps select most relevant sc reference
- Continuous identity measure: Use /sc expression profiles/ and /tissue-specific, cell-type annotated sc atlas/ to do quadratic programming at sc resolution to get sample and reference identity scores
- Discrete cell type classification: Use those identify scores to do permutation test, get p-values for both, perform Mann-Whitney Test and Binarization
- Will read into math details later…
Podcasts
- Gimlet Media, telling the story of how Spotify acquired them. Such a cool & honest & personal story. So meta. I wanna listen to the rest of this show.
The Frontlines of Addiction with Beth Macy
- Episode of /Why Is This Happening? With Chris Hayes/ with the author of /Dopesick/, discussing the past and present of the opioid epidemic. Trying out the show as recommended by /The Atlantic/, and this particular episode because of a recent convo with Max.
- Worth a listen for the knowledge and stories, but not a huge fan of the interviewer or interviewee’s style.
- This got a little… uh… conceptual, for me. Nonetheless, Ezra Klein always keeps me engaged.
- A thrilling mystery! I listened on a whim, and so glad I did. The story, the myth of this isolated royal family in the Indian jungle; the process of discovery and humans connections by reporter Ellen Barry; the shocking! plot twist (if IRL surprises can be considered “plot”); and the overall stylistic production of the mini-series— narration, music, sound-clips, interview— had me hooked.
- And wow— not to mention the horror of the India/Pakistan partition.
(Unedited) Ezra Klein with Krista Tippet
- From On Being: a shameless two hour indulgence with my very favorite conversationalists. Their words cannot be summarised; they must be experienced.
- Both of their awareness of, and ability to articulate about their theories of society, of systems and structures, of human nature, and of themselves… are very powerful.
- From the book: “Idealogical extremism is in the eye of the beholder.”
Nature Podcast: February 20, 2020
- Haven’t listened to one of these in awhile!
- Apparently “charge anxiety” is a thing for people with electric cars. Not gonna lie, I get “gasoline anxiety” in my normal car… is that really an excuse? Anyway, science: what’s the optimal speed to charge a battery without damaging it?
- Will Che at Stanford: Instead of charging and decharging batteries with diff protocols to end of life, did it 1/10 the time and used ML to look for early signs of failure -> retested most promising ones /and/ similar protocols to hone in on best option.
- Surprising result (vs conventional, charge fast then slow): charge at constant rate! At least for the type of batteries they tried.
- Real point = generalisable ML approach to optimise any kind of battery, any property much more efficiently.
- People developed version of soy-based glue strong enough to hold up to building standards— environmentally-friendly alternative to petro-based glue currently used, and has anti-microbial, flame-retardant qualities
- How to harness energy from charged ions in the air? People can get little bursts from nanowire… but team at UMass accidentally discovered that /protein/ nanowires from /Geobacter sufurreductans/, which grow as filaments to conduct electrons, are actually really good at this!
Ezra Klein with Ta-Nehisi Coates
- So good, the both of them. I’ve really just gotta read Ezra’s book.
Hardcore History: Glimpses of Olympias
- It’s been awhile since I’ve listened to one of these! Dan Carlin’s pure enthusiasm for the intricacies of these seemingly irrelevant historical figures— the formidable mother of Alexander the Great, wife of Phillip II, in this case— and his ability to integrate their stories into so many fascinating contexts will never fail to entertain, or to educate.
Citations Needed: The Conservative Sanctimony of Journalistic Impartiality
- This podcast always makes me uncomfortable, because the hosts are extremely direct in their criticism of the ideologies and media I consume (and that I don’t, for what it’s worth), but I think it’s very important to hear what I might be overlooking in my own information bubble.
- Here, the history and implications of a commitment to so-called “impartiality” or “objectivity” in journalism… which is not, in reality, a thing that is real.
- Who sources are and how they are characterised matters! Also, what journalists are & aren’t allowed do in their free time.
- Quote from somewhere… “The pursuit of balance can produce /imbalance/.”
Song Exploder: Bon Iver — Holyfields
- Finally, finally, finally listened to this. So good.
- Justin Vernon founded the band Bon Iver! Who knew that just wasn’t his name, haha. There’s a lot of them!
Hakai Audio Edition: Crocodiles Rising
- This was super random, but interesting! An exploration of the terrifying balance between humans and salty crocs in the Northern Territories of Australia: humans “manage” and regulate populations by hunting and farming, crocs draw in tourism and sometimes eat people. Fantastic.
Books
Endure — Alex Hutchinson — 2/4/20
- A sweeping catalogue of how our /psychology/ interact with our /physiology/ in human efforts to push our physical limits. Simultaneously inspiring and grounded in the evidence, Hutchinson presents a nuanced analysis of the science of endurance.
Inheritance — Dani Shapiro — 2/6/20
- A beautiful memoir on identity and family, a singular telling of an increasingly-common story. Audiobooks read by the author are the best kind, and Shapiro’s rendition is a prime example!
- Recommended by the NYT a week or so ago. Got a little slow towards the final third, but I suppose it was necessary to wrap up the story arc. Glad I listened!
- The three spiritual questions (of something…): Who am I? Why am I here? How shall I live?
The Information — James Gleick — 2/13/20
- This was a tough one to get through… some interesting anecdotes and concepts scattered throughout, but they were lost amidst a lot of stuff I’d read about previously, or just… couldn’t bring myself to care about.
The Personality Brokers — Merve Emre — 2/19/20
- Average, I guess. Told the very weird origin story of the Myers-Briggs Type Indicator, created by a mother and daughter who were… very unqualified, and also kind of weird… Threw in some “wow” stats about just how much the MBTI has permeated society. Coulda been about half as long and just as effective.
Skin in the Game — Nassim Nicholas Taleb — 2/23/20
- This was so, so mind-numbingly terrible. Literally an incoherent rant. I don’t know what else to say.
Inconspicuous Consumption — Tatiana Schlossberg — 2/27/20
- This was absolutely fantastic. Schlossberg was authentic, and articulated the struggle I have with comprehending my place in the current environmental context so well. She presents dozens of under-told stories about how we (individually and collectively) impact our world with intelligence and a unique acceptance of messy complexity.
- I took lots of notes, but really, go read this one. Better yet, get the audiobook— Schlossberg narrates herself, and she’s delightful to listen to.