Structure and genomic organization of the rat aldolase B gene


J. Mol. Biol. (1985) 181, 153-160

Ken-ichi Tsutsumi1, Tsunehiro Mukai2, Reiko Tsutsumi1, Soh Hidaka1, Yuji Arai2, Katsuji Hori2 and Kiichi Ishikawa1†

1Department of Biochemistry, Yamagata University School of Medicine, Zaoh-iida, Yamagata 990-23, Japan

2Department of Biochemistry, Saga Medical School, Nabeshima, Saga 840-01, Japan

(Received 27 August 1984, and in revised form 26 September 1984)

The structure of the chromosomal gene encoding rat aldolase isozyme B has been elucidated by sequence analysis of cloned genomic DNA. This gene comprises about 14 × 10^3 base-pairs of DNA, and is separated into nine exons by eight intervening sequences. A presumed transcription-initiation site was assigned by S1 nuclease protection mapping, and T-A-T-A and C-C-A-A-T boxes were found to be 25 and 126 base-pairs, respectively, upstream from this initiation site. There are three characteristic sequences of 100 to 200 base-pairs within the region of 870 base-pairs flanking the 5' side of the gene. These sequences are flanked on either side by direct repeats and terminate with an A-rich stretch of nucleotides. One of them has block homology with a region in an "ID sequence", which is reported to be an element for tissue-specific gene regulation and differentiation. The other two are analogous at the sequence organizational level with a sort of dispersed repeat, the "Alu family". These features suggest that these regions are involved in gene regulation and, also, imply evolutionary events such as duplication or insertion. Comparison of this gene sequence with the rabbit aldolase A complementary DNA sequence revealed some bias in the frequency of nucleotide replacement among the exons, suggesting selective evolutionary conservation of particular exons encoding functional domains. Comparison with the human aldolase B complementary DNA sequence revealed no such tendency; the homology between the two sequences was very high (about 89%), and nucleotide replacements were randomly distributed throughout the protein-coding region.

1. Introduction

The glycolytic enzyme, fructose-1,6-bisphosphate aldolase (aldolase; EC 4.1.2.13) is a tetrameric protein composed of a specific combination of different subunits: A (muscle type), B (liver type) and C (brain type) (Penhoet et al., 1966). These isozymes have been extensively characterized with respect to their tissue-specific distribution, change in their concentration during development or carcinogenesis, and enzymological characteristics (Horecker et al., 1972; Schapira et al., 1975; Lebherz, 1975). The genes encoding these subunits are not identical and are assumed to be located separately on the chromosomes, although they may have closely related structures (Penhoet et al., 1967; Benfield et al., 1979; Lai, 1975). These genes are, therefore, thought to have originated by duplication of a common ancestral gene during evolution. The expressions of these genes, however, show multiple patterns and seem to be regulated independently, but, in some instances, in a manner showing mutual influence; for example, (1) these three subunits are not all expressed simultaneously within the same cell or tissue; usually one or two of the three types are expressed in the "housekeeping" state, although in some fetal and hepatoma cells all three isozymes are expressed (Lebherz & Rutter, 1969; Penhoet et al., 1966; Lebherz, 1975); and (2) increase in the concentration of a particular type during development or carcinogenesis is often accompanied by decrease in the level of another pre-existing type; e.g., the levels of the A and B types in the liver change reciprocally during development or hepatocarcinogenesis (Schapira et al., 1963; Matsushima et al., 1968; Gracy et al., 1970; Ikehara et al., 1970; Schapira et al., 1975; Numazaki et al., 1984).

The study of these closely related aldolase genes is of much interest, with respect to whether the controls of their expression are reflections of their specific structures, especially in the regions regulating transcription, and whether the expressions of these genes are regulated in different ways. Previously, we and others isolated several complementary DNA clones for rat liver aldolase B and muscle aldolase A (Tsutsumi et al., 1983; Simon et al., ...

† Author to whom correspondence should be sent.
2. Materials and Methods

(a) Materials

Restriction endonucleases were obtained from Takara Shuzo (Kyoto, Japan) and Bethesda Research Laboratories. Escherichia coli DNA polymerase I, bacterial alkaline phosphatase and polynucleotide kinase were from Takara Shuzo. S1 nuclease was from Boehringer Mannheim. [γ-32P]ATP (3000 Ci/mmol) and other isotopes were from Amersham.

(b) Screening of the rat gene library

Charon 4A containing partial EcoRI or HaeIII digests of rat (Sprague-Dawley) genomic DNA were kindly provided by Drs T. D. Sargent, R. B. Wallace, L. L. Jagodzinsky and J. Bonner. The phage were inoculated into E. coli strain DP50SupF (a gift from Dr Y. Fujii-Kuriyama) and screened by in situ plaque hybridization as described by Benton & Davis (1977), with nick-translated aldolase B complementary DNA (Tsutsumi et al., 1983; K. Tsutsumi et al., 1984) as a probe. DNA fragments of the aldolase B gene were subcloned into the EcoRI site of the plasmid pBR322 for subsequent structural analyses.

(c) S1 nuclease mapping

S1 nuclease protection mapping was performed essentially as described by Berk & Sharp (1977). The DNA fragment was labeled with 32P at the 5' end and dissolved in 0.3 M-NaOH, and its strands were separated on a polyacrylamide/7 M-urea gel. The anti-coding strand was hybridized with liver poly(A)+ RNA. When the hybridizable region was expected to be short, hybridization was performed at 30°C for 3 h in 0.9 M-NaCl, containing 0.09 M-sodium citrate. The resulting hybrid was digested with S1 nuclease (400 units/ml) at 35°C for 30 min, and the remaining DNA was separated on a polyacrylamide/7 M-urea sequencing gel (Maxam & Gilbert, 1977). Bands were detected by autoradiography.

Complete or partial restriction enzyme digests were analyzed by electrophoresis on agarose or polyacrylamide gels. Sequences were determined by the procedure of Maxam & Gilbert (1977). [γ-32P]ATP and [α-32P]dideoxy ATP were used for terminal labeling of the DNA fragments.

3. Results and Discussion

(a) Isolation and restriction mapping of the aldolase B gene

Previously we reported the isolation and sequence of complementary DNA clones for rat aldolase B, the nucleotide sequence corresponding to more than 90% of the entire mRNA sequence (K. Tsutsumi et al., 1984). Using these complementary DNAs as probes in plaque hybridization, we isolated several genomic clones from rat EcoRI and HaeIII gene libraries (Fig. 1). One of them, RAB-16, was the first identified using complementary DNA as a probe. All the other clones were isolated in the same way using the insert DNA fragment in RAB-16, except clone RAB-6, which was identified using the DNA fragment in RAB-10 as a probe. The insert DNA in RAB-6 has the most extended sequence in the 5' direction of the gene, and contains the transcription-initiation site, as described below.

The restriction map of the aldolase B gene, deduced by analysis of cloned DNA fragments, is shown in Figure 1. Southern blot analysis (Southern, 1975) of the EcoRI fragment of rat total genomic DNA, examined using either the above cloned DNA or complementary DNA as a probe, gave bands of essentially the same size as those estimated for the cloned genomic DNA (data not shown). These findings provide evidence for a single chromosomal locus of the aldolase B gene. Therefore, the restriction map in Figure 1 should indicate the correct genomic organization of the rat aldolase B gene.

Exons in the gene were first roughly located by Southern blot hybridization of various restriction fragments of the cloned DNA using either nick-translated complementary DNA or 5'-32P-labeled partially purified aldolase B mRNA (Tsutsumi & Ishikawa, 1981) as a probe, and then determined exactly by sequence analysis with reference to the complementary DNA sequence determined previously (Tsutsumi et al., 1983; K. Tsutsumi et al., 1984). In this way, we located nine exons, as shown in Figure 1.

Figure 1. Organization of the rat aldolase B gene. (a) Scale in 10^3 base-pairs for (b). (b) The localizations of exons (I to IX, filled boxes) and introns (1 to 8) were determined by Southern blot hybridization analysis with complementary DNA as a probe, and sequence analysis as described in the text. The cleavage sites of several restriction nucleases used routinely are shown: E, EcoRI; H, HindIII; K, KpnI; P, PstI; X, XhoI; B, BamHI. (c) Schematic representation of aldolase B mRNA. The total length, the positions of AUG and UAG and splicing points are indicated as nucleotide residue numbers from the transcription-initiation site. I to IX correspond to the exons, as in (b). Numbers in parentheses indicate lengths of exons in nucleotides. (d) Three genomic clones RAB-6, RAB-10 and RAB-16. The exons are indicated by filled boxes and are numbered as in (b). The positions of EcoRI cleavage sites are indicated as in (b).

(b) Transcription-initiation site of the aldolase B gene

For clarification of the detailed exon-intron structure of the gene, the nucleotide sequences around the predicted exons were determined and compared with the complementary DNA sequence. For determination of the 5' boundary of the gene, however, we had to use S1 nuclease protection mapping, since our cloned complementary DNA lacks part of the cap site. For this purpose, an HhaI-AluI fragment (161 base-pairs) containing part of the first exon in the 1.4 × 10^3 base-pair fragment of the EcoRI digest of RAB-10 DNA (subcloned in pBR322) was isolated and labeled with 32P at its 5' end. The strands of the labeled DNA were separated, and the anti-coding strand (complementary to the mRNA sequence) was hybridized with liver poly(A)+ RNA, and then subjected to S1 nuclease digestion (Fig. 2). The protected fragment was applied to a polyacrylamide gel together with the same [32P]DNA fragment processed for sequence determination. Several bands of fragments of 28 to 31 base-pairs were detected on the gel. These corresponded to 5' G-G-A-T 3' in the sequence ladder, the most intense band corresponding to T. This point, A in the coding sequence of the gene, is probably the transcription-initiation site (cap site) of aldolase B mRNA, since transcription of eukaryotic mRNAs often begins with purines, and especially with A (Breathnach & Chambon, 1981). The first exon starting from this A contains the region that will hybridize with the region at the 3' end of rat 18 S ribosomal RNA (Chan et al., 1984) (Fig. 3).

At 25 base-pairs upstream from the presumed cap site, there is the sequence 5' T-A-T-A-A-A-A-A 3', which is homologous to the promoter for transcription in eukaryotic genes (T-A-T-A box) (Goldberg, 1979). The sequence 5' C-C-A-A-T 3' that is conserved in many other genes (C-C-A-A-T box) (Efstratiadis et al., 1980) was also found at position -126 relative to the cap site.
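The way promoter elements are placed relative to a cap site can be illustrated with a minimal sketch. It assumes a numbering convention in which the cap-site base is +1 and the base just upstream is -1, and it measures each motif from its first base; both choices are assumptions made for illustration, since the text does not state which base of each box is counted. The sequence is a synthetic toy, not the aldolase B sequence, and the function name is hypothetical.

```python
def positions_relative_to_cap(sequence: str, cap_index: int, motif: str) -> list[int]:
    """List where each occurrence of `motif` starts, relative to the cap site.

    Assumed convention: the cap-site base is +1, the base just upstream
    of it is -1, and there is no position 0.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        offset = start - cap_index  # negative when upstream of the cap site
        positions.append(offset if offset < 0 else offset + 1)
        start = sequence.find(motif, start + 1)
    return positions


# Synthetic toy sequence (NOT the real gene): G/C filler with the two motifs
# placed so that they start 126 and 25 bases upstream of the cap-site A.
toy = "CCAAT" + "G" * 96 + "TATAAAAA" + "C" * 17 + "A" + "GCGC"
cap_index = 5 + 96 + 8 + 17  # 0-based index of the cap-site A in the toy

print(positions_relative_to_cap(toy, cap_index, "TATAAAAA"))  # [-25]
print(positions_relative_to_cap(toy, cap_index, "CCAAT"))     # [-126]
```

With a real sequence, the same function would be called with the experimentally assigned cap-site index; the S1 mapping result is what fixes that index, not the motif search.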

(c) 5' Flanking sequence

A flanking sequence of 870 base-pairs on the 5' side of the gene was also determined (Fig. 3). In this region, there are scarcely any characteristic features of sequence organization, such as a frequently repeated or tandemly arranged sequence like that seen in the viral enhancer element (Benoist & Chambon, 1981; Banerji et al., 1981; Moreau et al., 1981; Gruss et al., 1981). However, unlike other regions, there are three A-rich sequences (at positions -728 to -707, -435 to -414 and -62 to -41). These sequences show more than 70% homology with each other, and each ends with 5' C-C-A-T-C-A-C-A 3' or an equivalent sequence. In addition, sequences homologous to those immediately following these A-rich blocks were found about 100 to 200 base-pairs upstream (indicated by horizontal arrows in Figs 3 and 4). Thus these A-rich blocks are located at the ends of sequences that are flanked on either side by direct repeats. These structural features imply the possible relation of these A-rich blocks to a certain type of the repeated sequence, the Alu family, that is dispersed throughout mammalian genomes (Jelinek & Schmid, 1982). The Alu families are thought to have been dispersed by duplication of unique DNA sequences at some target site on chromosomal DNA.

One of these regions at about position -770 (Fig. 4) shows some homology with the "ID sequence", which was first identified in brain-specific complementary DNA and is expected to prescribe tissue-specific gene expression (Sutcliffe et al., 1982). This ID sequence was also found in the growth hormone gene (Sutcliffe et al., 1982; Barta et al., 1981). These sequences in the aldolase B gene might have an enhancer-like function (Moreau et al., 1981; Banerji et al., 1981). Alternatively, the presence of Alu family-like or ID sequence-like sequences in the aldolase gene may indicate the occurrence of duplication or insertion of some DNA sequence near these regions during evolution and, if so, the rearrangement of the gene structure within the region immediately adjacent to the T-A-T-A box may influence or alter the system controlling gene expression. These possibilities are interesting in relation to the acquisition of tissue-specific expression of isozyme genes from a common ancestral gene. These points require further examination.

Figure 2. Location of the 5' end of the aldolase B gene. (a) Restriction fragment used for S1 nuclease protection mapping of the mRNA. A, Hh and E indicate cleavage sites of AluI, HhaI and EcoRI, respectively. The strand-separated HhaI-AluI fragment labeled at the 5' end with 32P used for the experiment is indicated. (b) The S1-resistant DNA fragment (S1) was subjected to electrophoresis on a 10% (w/v) polyacrylamide sequencing gel. Amounts of S1 nuclease used (units/ml) are shown above lanes. A DNA sequencing ladder prepared from the same fragment was used as a size marker. From left to right: G, A > C, T+C and C degradation products prepared by the method of Maxam & Gilbert (1977). Arrows indicate the bands of fragments protected against S1 nuclease treatment. bp, base-pairs.

Figure 3. Nucleotide sequence of the rat aldolase B gene. The sequence of all coding regions and parts of the flanking sequences are shown. The sequence is shown from left to right in the 5' to 3' direction. The presumed transcription-initiation site and poly(A) addition site are indicated by vertical arrows. The C-C-A-A-T box, T-A-T-A box, putative ribosomal binding site, initiation codon (ATG), termination codon (TAG) and A-A-T-A-A-A signal are all underlined with heavy lines. The broken lines and horizontal arrows in the 5' flanking region indicate the A-rich sequences and direct repeats, respectively, that are discussed in the text. Lengths of intron sequences are shown in parentheses. Hyphens are omitted from the sequence in all Figures for clarity.

Figure 4. Comparison of the sequence in the 5' flanking region of the aldolase B gene with the ID sequence. (a) The sequence from position -816 to -653. (b) The ID sequence in the second intron of the rat growth hormone gene (Sutcliffe et al., 1982; Barta et al., 1981). The underlined region in (b) indicates the 82 base-pair ID consensus sequence. Homologous residues in the 2 sequences are indicated by asterisks. Horizontal arrows indicate direct repeats. Hyphens between nucleotides are omitted for clarity, but are used to indicate deletions made in either strand to achieve a better fit of homologous regions.

(d) Organization of exon and intron structure

The aldolase B gene consists of nine exons and eight introns. The sequence of all the exons and parts of their flanking regions is shown in Figure 3. The aldolase B gene is about 14 × 10^3 base-pairs long from the transcription-initiation site to the poly(A) addition site. The lengths of the exons numbered in order in the transcriptional 5' to 3' direction are: exon I, 71; II, 122; III, 212; IV, 55; V, 161; VI, 84; VII, 175; VIII, 200; IX, 480 or 481 base-pairs. The size of exon IX cannot be determined until it is known whether the first A in the poly(A) tail is transcribed. The protein-coding region is split into eight exons (from exon II to exon IX), which are all identical to the corresponding regions in the complementary DNA sequence. The exon-intron boundary sequences in the gene are all referable to the 5' G-T ... A-G 3' rule (Breathnach & Chambon, 1981). The lengths of the introns are as follows (in 10^3 base-pairs): intron 1, 4.7; 2, 1.0; 3, 0.8; 4, 1.4; 5, 1.1; 6, 1.1; 7, 0.4; 8, 1.2.

From the structural organization of the gene described above, the complete mRNA sequence can be constructed (Fig. 1). The total length from the cap site to the poly(A) addition site was deduced to be 1560 or 1561 nucleotides. The 5' non-coding region from the cap site is 81 nucleotides long, and is located in exons I and II. The 3' non-coding region, 387 or 388 nucleotides long excluding the poly(A) tail, is entirely in exon IX. The protein-coding sequence is 1095 nucleotides long and is split into eight exons, from the initiator ATG codon in exon II to the terminator TAG codon in exon IX.
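The lengths quoted above can be cross-checked with simple arithmetic. The sketch below uses only the numbers given in the text; the variable and function names are illustrative, and the comparison of the gene span with the stated "about 14 × 10^3 base-pairs" is approximate because the intron lengths are rounded to 0.1 × 10^3 base-pairs.

```python
# Exon lengths in base-pairs, as listed in the text (exon IX is 480 or 481).
EXON_LENGTHS = {
    "I": 71, "II": 122, "III": 212, "IV": 55,
    "V": 161, "VI": 84, "VII": 175, "VIII": 200,
}
EXON_IX_OPTIONS = (480, 481)

# Intron lengths in units of 10^3 base-pairs, introns 1 to 8.
INTRON_KB = [4.7, 1.0, 0.8, 1.4, 1.1, 1.1, 0.4, 1.2]


def mrna_length_range() -> tuple[int, int]:
    """Sum of all exon lengths for both possible sizes of exon IX."""
    fixed = sum(EXON_LENGTHS.values())
    return fixed + EXON_IX_OPTIONS[0], fixed + EXON_IX_OPTIONS[1]


def intron_total_kb() -> float:
    """Total intron length in 10^3 base-pairs (inputs are rounded values)."""
    return round(sum(INTRON_KB), 1)


low, high = mrna_length_range()
print(low, high)                     # 1560 1561, matching the deduced mRNA length
print(intron_total_kb())             # 11.7
print(round(intron_total_kb() + low / 1000, 2))  # 13.26, i.e. roughly 13 x 10^3 bp

# The 5' non-coding region (81 nt) is longer than exon I (71 nt),
# so it must extend this many nucleotides into exon II.
print(81 - EXON_LENGTHS["I"])        # 10

# The coding length is a whole number of codons.
print(1095 % 3, 1095 // 3)           # 0 365
```

Note that 81 + 1095 + 387 gives 1563 rather than 1560. One possible reading, which is an assumption and not stated in the text, is that the termination codon is counted both in the 1095-nucleotide coding figure and in the 387-nucleotide 3' non-coding figure.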

(e) Sequence around the poly(A) addition site

The last exon, which is the largest (480 or 481 base-pairs), contains all the 3' non-coding region and part of the coding sequence for C-terminal amino acids. A consensus sequence for polyadenylation, A-A-T-A-A-A (Proudfoot & Brownlee, 1976), is found 21 base-pairs upstream from the poly(A) addition site. Two interesting features of the nucleotide sequence are observed around the poly(A) addition site. One is dyad symmetry making possible the formation of a stem-loop structure with the poly(A) addition site in the loop (Fig. 3). The other feature is complementary sequences that will hybridize to regions within the human small nuclear RNA U4 (U4 snRNA) (Fig. 5). Berget (1984) recently reported that U4 snRNA may mediate the polyadenylation process in mRNA synthesis. This snRNA has regions complementary to A-A-T-A-A-A, and to a second consensus element C-A-Y-T-G (Benoist et al., 1980) surrounding the poly(A) addition site. Both these consensus elements are present in the aldolase B gene, and there are also two sequences that may correspond to the latter element: T-A-C-T-G and C-A-C-T-G at 6 or 7 base-pairs and 14 or 15 base-pairs, respectively, downstream from the poly(A) addition site. As shown in Figure 5, these regions could strictly hybridize with the corresponding region in the U4 snRNA sequence. The transcription-termination site in the aldolase B gene is unknown, and of course, it is unlikely that the human and rat U4 snRNA sequences are identical. However, as Berget suggested, these features suggest that the two consensus elements, described above, in the initial transcript from the gene are recognized by U4 snRNA for cutting and polyadenylation at the specific site.

Figure 5. Nucleotide sequence around the poly(A) addition site and possible mode of hybridization with human U4 small nuclear RNA. Horizontal arrows facing each other indicate dyad symmetry, and arrows with broken lines indicate direct repeats. The poly(A) addition site is indicated by vertical arrows. The heavy line indicates the A-A-T-A-A-A signal. Nucleotide residue numbers of the U4 snRNA from the cap site are shown.
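Searching a sequence for the two consensus elements discussed in this section can be sketched as follows. This is a minimal illustration, assuming the standard IUPAC meaning of Y (C or T); the function names and the toy sequence are hypothetical and are not taken from the aldolase B gene.

```python
import re

# Subset of the IUPAC nucleotide codes needed here (Y = pyrimidine, R = purine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def iupac_to_regex(motif: str) -> str:
    """Translate a motif written with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 0-based start indices of all (possibly overlapping) matches."""
    # A lookahead makes each match zero-width, so overlapping hits are reported.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Synthetic toy sequence (NOT the real 3' end): G/C filler around the motifs.
toy = "GCGC" + "AATAAA" + "GCGCGCGCGCGCGCG" + "CACTG" + "GC"

print(find_motif(toy, "AATAAA"))      # [4]
print(find_motif(toy, "CAYTG"))       # [25]
print(find_motif("TACTG", "CAYTG"))   # []
```

The last call shows that a strict C-A-Y-T-G pattern does not match T-A-C-T-G, which is consistent with the text describing that sequence only as one that "may correspond" to the element. A search that tolerates one mismatch would be needed to pick it up.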

(f) Sequence comparison of the exons with human aldolase B complementary DNA and with rabbit aldolase A complementary DNA

Recently, Rottmann et al. (1984) reported the entire human aldolase B sequence deduced from complementary DNA and genomic clones. The amino acid sequence of rat aldolase B shows about 95% homology with the human counterpart. Detailed comparison showed about 89% homology between the two protein-coding nucleotide sequences (Table 1). The 175 nucleotides in exon VII, which encode the active site lysine and its surrounding region, showed about 90% homology with the corresponding region in the human sequence. However, there was no apparent bias in the frequency of nucleotide replacement among the eight exons (II to IX) in the rat aldolase gene. The protein-coding sequences in these exons all have similar homologies (86% to 93%) to the corresponding region in the human sequence, although the coding region in exon VIII has the lowest homology (85%). The 3' non-coding region, which is located entirely in the last exon (IX), has no significant homology with the human cDNA sequence, except for two conserved regions of about 50 nucleotides long (Besmond et al., 1983; Rottmann et al., 1984).

In the protein-coding sequences of rat aldolase B and rabbit aldolase A (Tolan et al., 1984), these relatively equal extents of nucleotide replacements throughout the exons are less marked: the sequences in exons II, V, VI and VII show relatively higher conservation (74% to 76% homology) than those in other regions (60% to 68%). This tendency is more noticeable in the corresponding amino acid sequences (Table 1). Exons V and VII encode domains containing the site for the interaction with substrate and the active site, respectively. These features may reflect the selective conservation of functional domains in aldolases, and may also be related to the difference in the enzymatic properties of the two aldolases, such as the substrate specificities,

Table 1
Comparison of the nucleotide sequence of the protein-coding region in the rat aldolase B gene with other aldolase complementary DNA sequences

Column headings: Homology between aldolase B gene and rabbit aldolase A complementary DNA sequence† (%), with amino acid homology (%) in parentheses; Comments on protein structure‡

† The nucleotide sequences of human aldolase B complementary DNA and rabbit aldolase A complementary DNA are taken from Rottmann et al. (1984) and Tolan et al. (1984), respectively.
‡ Data from Lai (1975), Hertman et al. (1976) and Patthy et al. (1979).
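The homology percentages compared in this section are position-by-position identities over aligned sequences. A minimal sketch of such a calculation is given below; it assumes the two sequences are already aligned to equal length with "-" marking deletions, and that gapped positions are excluded from the count. Both are assumptions, since the text does not describe how gaps were scored. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues between two pre-aligned sequences.

    Positions where either sequence has a gap ("-") are skipped
    (an assumed scoring choice, not taken from the paper).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not compared:
        raise ValueError("no ungapped positions to compare")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


# Toy 10-nucleotide example with a single replacement (not real exon data).
print(percent_identity("ATGGCTCACC", "ATGGCCCACC"))  # 90.0

# A gapped position is ignored, leaving five identical positions.
print(percent_identity("ATG-CT", "ATGGCT"))          # 100.0
```

Applying the function separately to each exon's aligned coding sequence would give per-exon values of the kind compared in the text; the same function works for amino acid sequences.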

We are grateful to Drs T. D. Sargent, R. B. Wallace and J. Bonner for a generous gift of the rat EcoRI gene library, and to Drs L. L. Jagodzinsky and J. Bonner for the HaeIII gene library. We also thank Drs Y. Mishima, Y. Fujii-Kuriyama, M. Muramatsu and Y. Nabeshima for valuable suggestions and preliminary R-loop analysis, Dr T. Tanaka for computer analysis and Mrs M. Seki for typing the manuscript. This work was in part supported by Grants-in-aid from the Ministry of Education, Science and Culture of Japan.

References

Banerji, J., Rusconi, S. & Schaffner, W. (1981). Cell, 27, 299-308.
Barta, A., Richards, R. I., Baxter, J. D. & Shine, J. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 4867-4871.
Benfield, P. A., Forcina, B. G., Gibbons, I. & Perham, B. L. (1979). Biochem. J. 183, 429-444.
Benoist, C. & Chambon, P. (1981). Nature (London), 290, 304-310.
Benoist, C., O'Hare, K., Breathnach, R. & Chambon, P. (1980). Nucl. Acids Res. 8, 127-142.
Benton, W. D. & Davis, R. W. (1977). Science, 196, 180-182.
Berget, S. M. (1984). Nature (London), 309, 179-182.
Berk, A. J. & Sharp, P. A. (1977). Cell, 12, 721-732.
Besmond, C., Dreyfus, J.-C., Gregori, C., Frain, M., Zakin, M. M., Trepat, J. S. & Kahn, A. (1983). Biochem. Biophys. Res. Commun. 117, 601-609.
Breathnach, R. & Chambon, P. (1981). Annu. Rev. Biochem. 50, 349-383.
Chan, Y. L., Gutell, R., Noller, H. F. & Wool, I. G. (1984). J. Biol. Chem. 259, 224-230.
Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O'Connel, C., Spritz, R. A., DeRiel, J. K., Forget, B. G., Weissman, S. M., Slightom, J. L., Blechl, A. E., Smithies, O., Baralle, F. E., Shoulders, C. C. & Proudfoot, N. J. (1980). Cell, 21, 653-668.
Goldberg, M. L. (1979). Ph.D. thesis, Stanford University, Palo Alto, California.
Gracy, R. W., Lacko, A. G., Brox, L. W., Adelman, R. C. & Horecker, B. L. (1970). Arch. Biochem. Biophys. 136, 480-490.
Gruss, P., Dhar, R. & Khoury, G. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 943-947.
Hertman, F. C. & Brown, J. B. (1976). J. Biol. Chem. 251, 3057-3062.
Horecker, B. L., Tsolas, O. & Lai, C. Y. (1972). In The Enzymes (Boyer, P. D., ed.), vol. 7, pp. 213-358, Academic Press, New York.
Ikehara, T., Endo, H. & Okada, Y. (1970). Arch. Biochem. Biophys. 136, 491-497.
Jelinek, W. R. & Schmid, C. W. (1982). Annu. Rev. Biochem. 51, 813-844.
Matsushima, T., Kawabe, S., Shibuya, M. & Sugimura, T. (1968). Biochem. Biophys. Res. Commun. 30, 565-570.
Moreau, P., Hen, R., Wasylyk, B., Everett, R., Gaub, M. P. & Chambon, P. (1981). Nucl. Acids Res. 9, 6047-6068.
Mukai, T., Joh, K., Miyahara, H., Sakakibara, M., Arai, Y. & Hori, K. (1984). Biochem. Biophys. Res. Commun. 119, 575-581.
Numazaki, M., Tsutsumi, K., Tsutsumi, R. & Ishikawa, K. (1984). Eur. J. Biochem. 142, 165-170.
Patthy, L., Varadi, A., Thesz, J. & Kovacs, K. (1979). Eur. J. Biochem. 99, 309-313.
Penhoet, E. E., Rajkumar, T. R. & Rutter, W. J. (1966). Proc. Nat. Acad. Sci., U.S.A. 56, 1275-1282.
Penhoet, E. E., Kochman, M., Valentine, R. & Rutter, W. J. (1967). Biochemistry, 6, 2940-2949.
Proudfoot, N. J. & Brownlee, G. G. (1976). Nature (London), 263, 211-214.
Rottmann, W. H., Tolan, D. R. & Penhoet, E. E. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 2738-2742.
Schapira, F., Dreyfus, J. C. & Schapira, G. (1963). Nature (London), 200, 995-996.
Schapira, F., Hatzfeld, A. & Weber, A. (1975). In Isozymes (Markert, C. L., ed.), vol. 3, pp. 987-1003, Academic Press, New York.
Simon, M.-P., Besmond, C., Cottreau, D., Weber, A., Chaumet-Riffaud, P., Dreyfus, J.-C., Trepat, J. S., Marie, J. & Kahn, A. (1983). J. Biol. Chem. 258, 14576-14584.
Southern, E. (1975). J. Mol. Biol. 98, 503-517.
Sutcliffe, J. G., Milner, R. J., Bloom, F. E. & Lerner, R. A. (1982). Proc. Nat. Acad. Sci., U.S.A. 79, 4942-4946.
Tolan, D. R., Amsden, A. B., Putney, S. D., Urdea, M. S. & Penhoet, E. E. (1984). J. Biol. Chem. 259, 1127-1131.
Tsutsumi, K. & Ishikawa, K. (1981). Biochem. Biophys. Res. Commun. 100, ...-412.
Tsutsumi, K., Mukai, T., Hidaka, S., Miyahara, H., Tsutsumi, R., Tanaka, T., Hori, K. & Ishikawa, K. (1983). J. Biol. Chem. 258, 6537-6542.
Tsutsumi, K., Mukai, T., Tsutsumi, R., Mori, M., Daimon, M., Tanaka, T., Yatsuki, H., Hori, K. & Ishikawa, K. (1984). J. Biol. Chem. in the press.
Tsutsumi, R., Tsutsumi, K., Numazaki, M. & Ishikawa, K. (1984). Eur. J. Biochem. 142, 161-164.

Edited by P. Chambon