A new five-parameter Birnbaum–Saunders distribution for modeling bicoid gene expression data

A new five-parameter Birnbaum–Saunders distribution for modeling bicoid gene expression data

A New Five-Parameter Birnbaum-Saunders Distribution for Modeling bicoid Gene Expression Data Journal Pre-proof A New Five-Parameter Birnbaum-Saunder...

523KB Sizes 0 Downloads 0 Views

A New Five-Parameter Birnbaum-Saunders Distribution for Modeling bicoid Gene Expression Data

Journal Pre-proof

A New Five-Parameter Birnbaum-Saunders Distribution for Modeling bicoid Gene Expression Data Hossein Hassani, Mahdi Kalantari, Mohammad Reza Entezarian PII: DOI: Reference:

S0025-5564(19)30516-4 https://doi.org/10.1016/j.mbs.2019.108275 MBS 108275

To appear in:

Mathematical Biosciences

Received date: Revised date: Accepted date:

24 June 2018 29 October 2019 29 October 2019

Please cite this article as: Hossein Hassani, Mahdi Kalantari, Mohammad Reza Entezarian, A New Five-Parameter Birnbaum-Saunders Distribution for Modeling bicoid Gene Expression Data, Mathematical Biosciences (2019), doi: https://doi.org/10.1016/j.mbs.2019.108275

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier Inc.

Highlights • An extended version of Birnbaum-Saunders distribution with five parameters is introduced. • Theoretical aspects of five-parameter Birnbaum-Saunders distribution and the maximum likelihood estimation of parameters are presented. • An accurate mathematical model for bicoid signal extraction is proposed. • Birnbaum-Saunders with five parameters is introduced and evaluated to describe the spatial gradient.

1

A New Five-Parameter Birnbaum-Saunders Distribution for Modeling bicoid Gene Expression Data Hossein Hassani∗ , Mahdi Kalantari† , Mohammad Reza Entezarian‡ ∗ Research

Institute of Energy Management and Planning (RIEMP), University of Tehran, Tehran 1417466191, Ira [email protected] † Department ‡ School

of Statistics, Payame Noor University, 19395–4697, Tehran, Iran.

of Mathematics, Statistics and Computer Science University of Tehran, Tehran, Iran

Abstract An extended version of Birnbaum-Saunders distribution with five parameters is introduced. Theoretical aspects of five-parameter Birnbaum-Saunders distribution and the maximum likelihood estimation of parameters are presented. The reliability and applicability of the proposed distribution is evaluated using both simulation and real-world data namely bicoid gene expression profile. The findings of this research confirm that the newly proposed five-parameter Birnbaum-Saunders distribution can be utilized to describe the distribution of bicoid gene expression profile. Keywords: Birnbaum-Saunders distribution; maximum likelihood estimation; bicoid gene; gene expression profile.

1

Introduction

The Birnbaum-Saunders (BS) distribution is a probability distribution used extensively in reliability applications to model failure times. This distribution, which is also known as the fatigue life distribution, originated from a problem of material fatigue proposed by Birnbaum and Saunders [1]. This model has received great attention in recent decades due to its many attractive statistical and probabilistic properties. Although this distribution was originally proposed in the field of material fatigue and reliability analysis, it has been widely applied to various fields beyond the original context such as earth, environmental and medical sciences; see for example [2–6]. The classical BS distribution contains two parameters; shape (α) and scale (β). The random variable T that follows the two-parameter BS distribution is denoted by T ∼ BS(α, β) and its cumulative distribution function (cdf) has the following form r !! r 1 t β F (t) = Φ , t > 0, α, β > 0, (1) − α β t where Φ (·) is the cdf of standard normal distribution. There is an interesting relationship between this model and the standard normal distribution. If T ∼ BS(α, β), then s r ! 1 T β Z= − ∼ N (0, 1). (2) α β T 2

For a detailed review on two-parameter BS distribution see [7]. Moreover, the random variable T follows the three-parameter BS distribution [8], denoted by T ∼ BS(α, µ, β), if its cdf is given by s !! r 1 t−µ β F (t) = Φ − , t > µ, α, β > 0, µ ∈ R, (3) α β t−µ where the parameter µ represents the location of distribution. Similar to the above situation with two parameters, the relationship between this model and the standard normal distribution can be represented as follows: s s ! T −µ β 1 − ∼ N (0, 1). (4) Z= α β T −µ Another extension of BS distribution, is based on Johnson’s transformation and includes four parameters (FBS), denoted by T ∼ F BS(δ, λ, ξ, γ) [8], which the following cdf F (t) = Φ (a(t)) , where a(t) = γ + δ

t > ξ,

δ, λ > 0,

r

s

t−ξ − λ

λ t−ξ

ξ, γ ∈ R,

(5)

!

(6)

.

The parameters δ, λ, ξ and γ are the shape, scale, location and non centrality parameters, respectively. The probability density function (pdf) of FBS distribution is as follows: f (t) = φ(a(t))At ,

t > ξ,

ξ, γ ∈ R,

δ, λ > 0,

(7)

where φ (·) is the pdf of standard normal distribution and −3 δ At = √ (t − ξ) 2 (t − ξ + λ) . 2 λ

If T ∼ F BS(δ, λ, ξ, γ), then Z =γ+δ

r

T −ξ − λ

s

λ T −ξ

!

(8)

∼ N (0, 1).

(9)

The quantile function (qf) and hazard rate of FBS distribution was studied in [8]. If T ∼ F BS(δ, λ, ξ, γ), then the qf and hazard rate of T are given by

Qp =



−1

λ  Φ (p) − γ + 4 δ h(t) =

s

φ(a(t)) At , Φ(−a(t))

Φ−1 (p) δ

−γ

t > ξ,

2

2

+ 4 + ξ,

δ, λ > 0,

0 < p < 1,

ξ, γ ∈ R.

(10) (11)

where Φ−1 (p) is the pth quantile of standard normal distribution. The FBS distribution includes the two-parameter and three-parameter versions of BS distribution as particular cases. Since it can be easily seen that if γ = ξ = 0, then T ∼ BS(α = 1/δ, β = λ), and if γ = 0, ξ = µ, then T ∼ BS(α = 1/δ, µ = ξ, β = λ). Therefore, the pdf, qf and hazard rate of the two-parameter and three-parameter versions of BS distribution can be obtained from FBS distribution. For a detailed information on FBS distribution see [8]. 3

Recently, the BS distribution has been used as a mathematical model to describe the distribution of bicoid gene expression profile. For instance, fifty-four commonly used distributions were evaluated in [9] and authors concluded that BS distribution with three parameters fits the bicoid profile more accurately than the other distributions. It should be noted that fitting an accurate statistical distribution to bicoid profile is of great importance because of significantly stochastic, highly volatile with a heavy tail and non-normal structure of these data. The aim of this research is to introduce an extended version of BS distribution with five parameters. Our motivation for extending the BS distribution is recent findings in [10] where authors proposed the FBS distribution to describe the expression profile of bicoid gene. The high performance of FBS distribution in fitting an adequate statistical distribution to bicoid gene expression profile encouraged us to apply an extended version of it including five parameters. This new distribution is the extension of FBS distribution based on Marshall and Olkin’s method, which includes an extra parameter making it more flexible, thus widening the applications of this distribution. This paper is organized as follows. In Section 2, the pdf, cdf, qf, hazard rate and Maximum Likelihood Estimation (MLE) of parameters of five-parameter BS distribution are proposed. Section 3 provides an extensive simulation study and in Section 4, the five-parameter BS distribution (5BS) is also evaluated on the real data including all the cleavage cycles in which bicoid gene is expressed in the embryo. The paper concludes with a concise summary in Section 5.

2

A new five-parameter BS distribution

In this section, we propose a new five-parameter BS distribution, which is the extension of FBS distribution using Marshall and Olkin’s method [11]. This new distribution with five parameters denoted by 5BS. The resulting distribution is more flexible than the original distribution owing to an additional parameter. The pdf, cdf, qf and hazard rate of 5BS distribution are carried out in this section. Finally, we propose the log-likelihood function and equations to obtain ML estimates of distribution parameters.

2.1

Probability and Cumulative Distribution Function

Let F (t) be the cdf of a continuous random variable T . The associated Marshall-Olkin (MO) extended distribution has cdf given by G(t) =

F (t) F (t) = , 1 − η¯F¯ (t) F (t) + η F¯ (t)

η > 0, t ∈ R,

(12)

where η¯ = 1 − η and F¯ (t) = 1 − F (t). The pdf of MO extended distribution is ηf (t) , η > 0, t ∈ R, (13) {1 − η¯F¯ (t)}2 where f (t) is the pdf of random variable T . It can be easily seen that for η = 1, G(t) = F (t). Therefore, the MO extended distribution includes the original distribution as a special case. Now, let us extend the FBS distribution by Marshall and Olkin’s method. We denote the random variable T that follows the MO extended FBS distribution by T ∼ 5BS(δ, λ, ξ, γ, η) and, according to (5) and (12), its cdf is as follows: g(t) =

G(t) =

Φ(a(t)) Φ(a(t)) = , ¯ ¯ 1 − η¯Φ(a(t)) Φ(a(t)) + η Φ(a(t))

t > ξ,

δ, λ, η > 0,

ξ, γ ∈ R,

(14)

¯ (·) = 1 − Φ (·) is the survival function of standard normal distribution and a(t) is where Φ defined in (6). Using (5), (7) and (13), the pdf of 5BS distribution is 4

g(t) =

ηφ(a(t)) A, 2 t ¯ {1 − η¯Φ(a(t))}

t > ξ,

δ, λ, η > 0,

ξ, γ ∈ R,

(15)

where At defined in (8). Suppose T ∼ 5BS(δ, λ, ξ, γ, η). It is evident that if γ = ξ = 0, η = 1, then T ∼ BS(α = 1/δ, β = λ); if γ = 0, ξ = µ, η = 1, then T ∼ BS(α = 1/δ, µ = ξ, β = λ), and if η = 1, then T ∼ F BS(δ, λ, ξ, γ). Therefore, the 5BS distribution contains all the previous versions of BS distribution as particular cases. Figure 1 shows the plots of pdf of 5BS distribution for δ = 2, λ = 3 and some other parameter values. These plots illustrate that how the extra parameter η makes the four-parameter BS distribution more flexible. It can be easily seen from Figure 1 that the value of η has a significant effect on skewness and kurtosis of 5BS distribution. ξ = 1, γ = 0.6

ξ = −1, γ = 1

0.6

0.8

1.0

η = 0.25 η = 0.5 η = 1 (FBS) η=2 η=5 η = 20

0.4 0.2 0.0

0.0

0.2

0.4

g(t)

0.6

0.8

1.0

η = 0.25 η = 0.5 η = 1 (FBS) η=2 η=5 η = 20

2

4

6

8

10

0

t

2

4

6

8

10

t

Figure 1: Plots of the pdf of 5BS distribution (δ = 2, λ = 3).

2.2

Quantile Function, Hazard Rate and TTT Plot

Suppose that Qp is the quantile of order p of 5BS distribution, that is, p = G (Qp ). Then we have Φ(a(Qp )) p= , (16) ¯ 1 − η¯Φ(a(Q p )) and hence

Qp = a

−1

   pη −1 Φ 1 − p¯ η

(17)

where a−1 (·) is the inverse of a(t), which is defined in (6). Therefore, using (17), the qf of 5BS distribution is given by 2  s 2 λ k−γ k−γ Qp =  + + 4 + ξ, 0 < p < 1, (18) 4 δ δ

  pη where k = Φ−1 1−p¯ . For η = 1, the qf of FBS distribution in (10) is obtained. The qf (18) η enables us to generate a random variable having 5BS distribution using the inverse transform 5

method. Let U be a uniform (0, 1) random variable. The random variable T = G−1 (U ) = Q(U ) has the 5BS distribution. The median of this distribution is given by M e = Q( 12 ) as  2 s 2 λ k−γ k−γ Me =  + + 4 + ξ, (19) 4 δ δ   η −1 . where k = Φ 1+η In general, the hazard rate of random variable T is defined by f (t) h(t) = ¯ , (20) F (t) where F¯ (t) = 1 − F (t) is the survival function of T , f (·) is the pdf of T and F (·) its cdf. Therefore, using (14) and (15), the hazard rate of 5BS distribution is defined a follows: φ(a(t)) At , t > ξ, δ, λ, η > 0, ξ, γ ∈ R. (21) (1 − η¯Φ(−a(t))) Φ(−a(t)) It is worth mentioning that hazard rate can be used for studying the behavior of a statistical distribution. For instance, if a distribution is absolutely continuous, the hazard rate uniquely determines it [12]. The hazard rate can be increasing, constant (exponential distribution), decreasing, ∪-shaped or bathtub, ∩-shaped or inverse bathtub, and upside-down bathtub [7]. The hazard rate of FBS distribution (Relation (11)) can be easily assessed using (21) by setting η = 1. Figure 2 depicts the plots of hazard rate of 5BS distribution for δ = 2, λ = 3 and some other parameter values. These plots indicate that the hazard rate of 5BS distribution has a variety of shapes depending on the values of parameters. h(t) =

ξ = 1, γ = 0.6

ξ = −1, γ = 1

1.0

1.5

η = 0.25 η = 0.5 η = 1 (FBS) η=2 η=5 η = 20

0.5 0.0

0.0

0.5

h(t)

1.0

1.5

η = 0.25 η = 0.5 η = 1 (FBS) η=2 η=5 η = 20

2

4

6

8

10

0

2

t

4

6

8

10

t

Figure 2: Plots of the hazard rate of 5BS distribution (δ = 2, λ = 3). The Total Time on Test (TTT) function is another suitable means to describe the shape of the hazard rate [12]. This function and its scaled version are defined as follows: Z F −1 (u) −1 H (u) = (1 − F (y)) dy, (22) W (u) =

0 −1

H (u) , H −1 (1) 6

0 < u < 1.

(23)

The scaled TTT function W (u) can be approximated by empirical version of it which is defined as   Pk T(i) + (n − k)T(k) k = i=1 Pn Wn , k = 0, . . . , n, (24) n i=1 T(i) where T(i) is ith order statistic. Plotting the consecutive points (k/n, Wn (k/n)) offers the empirical scaled TTT plot, can be utilized to identify the type of hazard rate that the data might have [8]. If the TTT curve is concave (or convex), then the increasing hazard rate (or decreasing hazard rate) family is appropriate. Additionally, if the TTT curve is first concave (or convex) and then convex (or concave), a bathtub (or inverse bathtub) hazard rate should be used. Moreover, if the TTT curve is a straight line, the exponential distribution should be employed [8]. More details on TTT function can be found in [12].

2.3

Maximum Likelihood Estimation

Let T1 , T2 , . . . , Tn be a random sample of size n from the 5BS distribution. Having the pdf in (15), the log-likelihood function for the unknown parameter vector θ = (δ, λ, ξ, γ, η)T is given by n X l(θ) = log g(ti ) i=1

n X ¯ = {log(η) + log(φ(a(ti ))) + log(Ati ) − 2 log(1 − η¯Φ(a(t i )))} i=1

n

1X 2 n a (ti ) = nc + n log(η) + n log(δ) − log(λ) − 2 2 i=1 n

n

(25)

n

X X 3X ¯ log(ti − ξ) + log(ti − ξ + λ) − 2 log(1 − η¯Φ(a(t − i ))), 2 i=1 i=1 i=1

where t1 , t2 , . . . , tn is the observed sample and c = −1 log(2π) − log(2) is the constant that does 2 not depend on θ. Now, by taking the partial derivatives of l(θ) with respect to δ, λ, ξ, γ and η, the likelihood equations can be obtained as follows: n n 1 X ti − ξ − λ √ −√ B(ti ) = 0, δ ti − ξ λ i=1 n n −n X 1 δ X ti − ξ + λ √ + + B(ti ) = 0, 2λ t − ξ + λ 2λ3/2 i=1 ti − ξ i=1 i n

n

n

X 3X 1 1 δ X ti − ξ + λ − + √ B(ti ) = 0, 3/2 2 i=1 ti − ξ t − ξ + λ (t − ξ) 2 λ i i i=1 i=1 n X B(ti ) = 0,

(26)

i=1

n X

¯ Φ(a(t i )) = 0, ¯ 1 − η¯Φ(a(ti )) i=1  T 2¯ η φ(a(ti )) b b b b where B(ti ) = a(ti ) + 1−¯ . The ML estimate θ = δ, λ, ξ, γ b , η b of θ = (δ, λ, ξ, γ, η)T can ¯ η Φ(a(t i )) be obtained by solving the likelihood equations given in (26), simultaneously. These equations can not be solved analytically. However, they can be solved by a numerical method. In this research, we use the function mledist from R package fitdistrplus to get ML estimates. More information on this package can be found in [13]. n −2 η

7

3

Simulation Study

A simulation study is first utilized to assess the applicability of the proposed 5BS distribution to real bicoid expression profiles. The Synthesis Diffusion Degradation (SDD) model is a common model for analyzing the bicoid profile and follows an exponential curve [14–17]: B = Ae(x/λ) ,

(27)

where A is the amplitude, x is distance from the anterior, and λ is the length parameter. Here, an exponential curve with A = 150 and λ = 30 is used. Furthermore, normally distributed random noise with zero mean and different variances were added to various parts of the series to generate a noisy data. In total 100 observations were generated and the simulation was repeated 5000 times. Here, Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) tests are used, and the pvalues of these tests are considered as the criterion for decision making after each step of simulation. Figure 3(a) depicts the box plots for the value of test statistic for KS and AD tests. According to the results, the variation in the AD test statistic is greater than KS test statistic. Figure 3(b) displays the box plots of p-value for KS and AD tests confirming that that the 5BS distribution fits all simulated data very well. Figure 3(c), also illustrates the box plot of MLE of δ, λ, ξ, γ and η. These box plots reveal that the variance of scale parameter λ is greater than other parameters. The standard error of ML estimators reported in Table 1 indicate that the shape parameter (δ) has the lowest standard error compared to other parameters. (b) p−values





30

● ●

(c) MLE of parameters

1.0

(a) Test Statictics





1.5



● ● ● ● ● ● ● ● ● ● ● ● ●

20 10

0.6

1.0

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

0.8

● ●



0.4

● ●

0.5



● ● ● ●

0

0.2

● ● ● ● ● ● ● ● ● ● ● ● ● ●



● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

● ● ● ● ●

● ● ● ● ● ● ●

● ●

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

● ● ● ● ● ● ●





0.0



KS

AD

KS

AD

δ

λ

ξ

γ

η

Figure 3: The box plot of test statistics, p-values and MLE of parameters. parameter standard error

δ λ ξ γ η 0.215 5.424 1.317 0.696 0.356

Table 1: Standard error of ML estimators in simulation study.

4

Real Data Application

In this section, we apply the 5BS distribution to describe the data of cleavage cycles 10, 11 and 12 in which bicoid gene is expressed in the embryo. A detailed description of different cleavage 8

cycles can be found in [9]. Table 2 presents the p-values of KS and AD tests together with MLE of parameters obtained in fitting 5BS distribution to these data. The standard errors of ML estimators are reported in parenthesis. Similar to the simulation study, the results obtained here confirm that 5BS distribution fits very well all real data sets at %1 significance level. In addition, in all cases except BECab7, the standard error of scale parameter λ is greater than other parameters. KS AD δ λ p-value p-value ab2 0.877 0.724 0.712 (0.063) 20.006 (10.127) BECab7 0.442 0.458 0.211 (0.034) 0.714 (0.259) BECab11 0.997 0.977 1.077 (0.508) 15.872 (26.675) BECac30 0.673 0.940 0.917 (0.384) 61.993 (93.481) Embryo

ab18 ac22 ad14 ad22 ad23 BECab14 BECab17 BECab19 BECad4 cb16

0.786 0.499 0.343 0.547 0.542 0.739 0.997 0.981 0.937 0.171

0.450 0.335 0.152 0.675 0.370 0.552 0.999 0.993 0.744 0.126

0.903 0.862 0.945 0.954 0.994 0.817 0.864 1.035 0.540 0.803

(0.078) (0.079) (0.074) (0.079) (0.193) (0.194) (0.326) (0.430) (3.545) (0.063)

ab17 ad4 ad6 ad16 ad27 be4 be7 BECab10 BECac12 BECac36

0.196 0.446 0.084 0.052 0.317 0.083 0.096 0.772 0.595 0.569

0.083 0.301 0.041 0.034 0.182 0.035 0.017 0.567 0.524 0.388

0.757 0.822 0.740 0.810 0.845 0.831 0.783 1.325 0.787 0.965

(0.035) 27.013 (7.441) (0.056) 22.003 (7.997) (0.042) 21.995 (6.423) (0.039) 29.997 (8.901) (0.070) 17.000 (6.420) (0.039) 42.998 (11.980) (0.033) 25.998 (6.396) (0.128) 46.048 (27.567) (0.092) 13.000 (7.283) (0.189) 13.014 (8.529)

ξ 0.455 (0.497) -0.010 (0.007) 3.841 (4.298) 1.758 (4.930)

23.014 (11.185) 2.691 (0.817) 21.015 (10.029) 4.176 (0.794) 33.008 (14.824) 5.146 (1.134) 24.992 (13.256) 1.800 (0.844) 15.007 (10.550) 2.514 (1.579) 15.005 (12.607) 11.503 (1.574) 6.720 (7.666) 5.091 (1.200) 12.038 (15.732) 11.157 (2.914) 2.690 (38.402) 20.074 (14.821) 22.506 (9.788) 4.046 (0.667) 8.326 4.046 4.156 5.940 7.958 0.250 2.919 7.529 6.504 0.897

(0.369) (0.574) (0.427) (0.486) (0.582) (0.433) (0.360) (2.513) (0.973) (1.351)

γ

η

-0.296 (0.800) 0.535 (0.463) -1.675 (0.475) 0.106 (0.099) -1.387 (2.127) 0.254 (0.356) 2.470 (0.817) 0.501 (0.334) -0.473 -0.531 -0.443 -0.455 -1.255 -1.171 -1.557 -1.597 -2.839 -0.444

(0.743) (0.704) (0.667) (0.858) (0.900) (1.012) (1.333) (1.556) (4.467) (0.638)

0.481 0.519 0.436 0.337 0.444 0.425 0.431 0.390 0.290 0.450

(0.319) (0.334) (0.249) (0.246) (0.297) (0.364) (0.476) (0.407) (0.890) (0.278)

-0.244 -0.539 -0.402 -0.265 -0.727 -0.069 -0.277 -0.612 -2.715 -1.455

(0.431) (0.556) (0.439) (0.450) (0.546) (0.430) (0.375) (0.996) (0.337) (0.810)

0.555 0.592 0.906 0.444 0.645 0.293 0.575 0.190 0.014 0.359

(0.247) (0.307) (0.401) (0.197) (0.318) (0.125) (0.219) (0.111) (0.010) (0.226)

Table 2: KS and AD p-values and MLE of parameters. Figure 4 depicts the box plot of estimated parameters at three cleavage cycles of 10, 11 and 12. Similar to simulation findings, it can be easily seen from this figure that the variation of scale parameter λ, among all cleavage cycles, is greater than other parameters. It is noteworthy that molecules encoded by bicoid begin to appear in the embryo at cleavage cycle 10 making the lengths of the bicoid profiles of cycles 10 and 11 notably lower than the consequent cycles [10]. Hence, as can be seen, the accuracy of parameter estimation in particular for the scale parameter (λ) is inversely related to the length of available profiles. Figures 5 and 6 show the correlation between the ML estimators of parameters computed numerically using Hessian matrix at three cleavage cycles of 10, 11 and 12. These figures provide an evidence of positive correlation between ML estimators of δ, λ, γ and η in most of embryos. Moreover, the parameter ξ has negative correlation with other parameters in most of cases. Figures 7-11 depict the empirical scaled TTT plots of all embryos (first row) along with the hazard rates of corresponding 5BS distribution (second row) based on MLE of parameters δ, λ, ξ, γ and η (reported in Table 2). These empirical scaled TTT plots enable us to detect the shape of hazard rates well. In cleavage cycle 10, the hazard rate of embryo BECab7 is decreasing. However, the shape of almost all hazard rates in cleavage cycles 11 and 12 are upside-down bathtub. 9

Cleavage Cycle 11

Cleavage Cycle 12

10

5

10

20

10

20

30

15

20

40

30

25

50

40

30

60

Cleavage Cycle 10



● ●

0

0

0



● ●

δ

ξ

λ

γ

η

δ

ξ

λ

γ

η

δ

λ

γ

ξ

η

Figure 4: Box plots for MLE of parameters.

ab2

γ η ξ λ δ 0.51 −0.69 0.43 0.29 λ −0.6 0.87 0.62

BECac30

1 0.8

γ η ξ λ δ 0.77 −0.88−0.08−0.97

0.6

0.6

0.4

0.4

0.4

0.2

λ−0.94 0.54 −0.8

−0.2

ξ−0.34 0.87

−0.8 −1

γ η ξ λ δ 0.76 −0.66 0.77 0.72 λ−0.73 0.58 0.42

1

γ η ξ λ δ 0.71 −0.79 0.62 0.41

0.4

−0.2

ξ−0.46−0.16

ξ−0.53−0.26

−0.2

γ 0.88

−0.8

γ 0.9

−0.8 −1

1

γ η ξ λ δ 0.78 −0.83 0.66 0.43

−0.2

γ η ξ λ δ 0.96 −0.89 0.88 0.65

1

1 0.8

0.6

0.6

0.4

0.4

−0.2

−0.6 −0.8 −1

λ−0.83 0.94 0.69

0.2 0

ξ−0.47−0.17

ξ−0.67−0.38

−0.4

γ 0.89

γ 0.96

γ 0.88

−0.8 −1

0.2

γ η ξ λ δ 0.99 −0.94 0.96 0.82

−0.6 −0.8

−0.2

1 0.8 0.6

λ−0.92 0.97 0.83

0.4 0.2 0

ξ−0.83−0.65

−0.6 −0.8 −1

Figure 5: Correlation between the ML estimators of parameters (a).

10

−0.2

−1

−0.4

−0.6

0.2

−0.4

0

−0.2

0.4

BECab19

0.4

λ−0.72 0.9 0.62

1 0.8

0

ξ−0.83 −0.7

−0.8

0.6

0.2

λ−0.92 0.97 0.87

−0.6

ad23

0.8

−0.8

0.6

−1

ac22

0.8

0.2

γ η ξ λ δ 0.99 −0.93 0.96 0.87

−0.4

−0.6

−0.6

−1

0

−0.4

−0.6

−0.4

γ 0.88

λ−0.71 0.93 0.7

0.2 0

0

ξ−0.81−0.54

1 0.8 0.6

λ−0.68 0.91 0.62

−0.2

BECab17

0.4

BECab11

λ−0.91 0.96 0.73

γ η ξ λ δ 0.69 −0.79 0.66 0.52

1 0.8

0.2

−0.4

γ 0.92

−1

ad22

0.4

0

ξ−0.75−0.53

−0.8

0.6

−1

γ η ξ λ δ 0.97 −0.94 0.94 0.74

−0.8

0.4 0.2

λ−0.89 0.95 0.76

−0.6

0.6

−0.4

γ 0.97

−0.2

1 0.8 0.6

−0.4

γ 0.85

−0.6

−1

0

ξ−0.34−0.25

ξ−0.48−0.12

−0.2

ab18

0.8

0.2

γ η ξ λ δ 0.97 −0.92 0.91 0.74

0

−0.4

γ 0.07

−0.6

BECab7

λ−0.71 0.91 0.56

0.2 0

−0.4

γ 0.92

BECab14

1 0.8

0.6

0

ξ−0.33 −0.1

ad14

γ η ξ λ δ 0.68 −0.8 0.58 0.35

1 0.8

−0.2 −0.4

γ 0.93

−0.6 −0.8 −1

λ δ 1

BECad4

γ 1

ξ −1 λ −1

η 0.99 0.98

1

ad4

γ η ξ λ δ 0.82 −0.83 0.7 0.53

1 0.8

0.6

0.6

0.4

0.4

0.4

λ−0.71 0.92 0.71

0.2

ξ−0.49−0.25

−0.2

−0.8

γ η ξ λ δ 0.74 −0.79 0.54 0.33

1

0.6 0.4

λ−0.64 0.89 0.65

0

ξ−0.36−0.13

−0.2

−0.8

γ η ξ λ δ 0.45 −0.7 0.38 0.23

1

−0.8

γ η ξ λ δ 0.51 −0.71 0.4 0.23

0.6

0.4

0.4

λ−0.61 0.87 0.56

0.2 0

ξ−0.38−0.08

−0.2

ξ−0.34−0.06

−0.2

γ 0.88

−0.8 −1

λ−0.85 0.95 0.73

0.2

γ 0.88

−0.8 −1

−0.8

1 0.8

0.4 0.2 0

ξ−0.71−0.46

−0.2 −0.4

−0.6

−0.6

0.6

0

−0.4

−0.6

γ η ξ λ δ 0.97 −0.89 0.91 0.72

1 0.8

0.6

λ−0.64 0.88 0.56

−0.2

BECac36

0.4 0.2

0.2

−1

be7

1

0.4

−0.4

γ 0.12

−0.6

−1

0.8

1 0.8

0

ξ−0.68 0.25

−0.2

0.6

−0.4

γ 0.9

−0.8

ad16

0.8

λ−0.81 0.79 −0.5

0.2

−0.4

γ 0.86

−0.6

−1

0

ξ−0.34−0.09

ξ−0.37−0.08

−0.2

−0.8

0.6

0

−0.4

γ 0.92

−0.6

ab17

λ −0.6 0.88 0.6

λ−0.57 0.87 0.52

0.2

γ η ξ λ δ 0.53 −0.72 0.82 0.36

1

0.4

0.2

−0.6

−1

0.8

0.6

−1

γ η ξ λ δ 0.46 −0.68 0.38 0.24

γ η ξ λ δ−0.05−0.41 0.1 0.15

−0.2

BECac12

0.4

−0.4

γ 0.89

−0.8

be4

1

0.2

−0.4

γ 0.82

−0.6

−1

0.8

0.4

0

ξ−0.71−0.33

−0.2

0.6

0

ξ−0.45−0.14

−0.8

ad6

0.8

λ−0.83 0.96 0.64

0.2

−0.4

γ 0.91

−0.6

−1

cb16

λ−0.71 0.89 0.6

ξ −0.5 −0.26

−0.2

1 0.8 0.6

0

−0.4

γ 0.92

−0.6

−1

γ η ξ λ δ 0.7 −0.8 0.58 0.37

λ−0.73 0.92 0.69

0.2 0

−0.4

γ 0.99

BECab10

γ η ξ λ δ 0.79 −0.88 0.78 0.58

1 0.8

0.6

0

ξ −1 −0.98

ad27

γ η ξ λ δ 0.88 −0.83 0.76 0.55

1 0.8

−0.2 −0.4

γ 0.9

−0.6 −0.8 −1

−0.6 −0.8 −1

Figure 6: Correlation between the ML estimators of parameters (b).

0.8

0.05 0.04 0.03 0.02 0.01 0.00 0

20

40

60 t

80

1.0

100

0.2

0.4

0.6

0.8

1.0

u

2

4

6 t

8

10

1.0 c(0, W)

0.2

0.4

0.6

0.8

1.0 0.8 0.6

c(0, W)

0.2 0.0

0.0

0.2

0.4

0.6

0.8

1.0

u

0

50

100 t

150

200

h5BS(x, delta = 0.917, lambda = 61.993, xi = 1.758, gamma = 2.47, eta = 39) 0.000 0.010 0.020 0.030 0.0

0.6 u

h5BS(x, delta = 1.077, lambda = 15.872, xi = 3.481, gamma = −1.387, eta = 0.254) 0.00 0.01 0.02 0.03 0.0

0.4

BECac30

0.4

0.8 0.6

c(0, W) 0.2

h5BS(x, delta = 0.211, lambda = 0.714, xi = −0.01, gamma = −1.675, eta = 0.106) 0.05 0.10 0.15 0.20 0.25 0.0

0.2

0.4

0.8 0.6

Wn(u)

0.4 0.2 0.0 0.0

h(t)

BECab11

1.0

BECab7

1.0

ab2

0.0

0.2

0.4

0.6

0.8

1.0

u

0

50

100

150

200

250

300

t

Figure 7: Empirical scaled TTT plots of embryos in cleavage cycle 10 (first row) and corresponding hazard rates (second row).

11

0.005

0.015

h(t) 0.010

0.020

0.025

0.0 0.0

20 0.2

40 0.4 0.6

t 60

u 0.8

80 1.0

100

0.0 0.2

20 0.4 0.6

40

t 60

u 0.8

80 1.0

100

0.0 0.2

20 t

40 60

0.4 0.6

t 60

12

80

u 0.8

80

1.0

100

1.0

100 1.0

40

0.8

0.8

BECab17

0.6

0.0

h5BS(x, delta = 0.954, lambda = 24.992, xi = 1.8, gamma = −0.455, eta = 0.337) 0.00 0.01 0.02 0.03 0.04 0.0

u 0.0

0

0.0

20

0.2

20

BECab19

0.2

40

0.4 0.6

40 60

0.4 0.6

60

t

0.8

t 80

u 0.8

80

h5BS(x, delta = 0.994, lambda = 15.007, xi = 2.514, gamma = −1.255, eta = 0.444) 0.000 0.010 0.020 0.0

u

1.0

100

0.6

20

0.4

c(0, W)

100

0.2

0.4

80

0.0

0.2

0.03

0.2

0.2

0.2

0.2

0.2

0.6

0.6

0.6

0.4

0.6

c(0, W)

0.4

c(0, W)

0.4

c(0, W)

0.4

c(0, W)

0.6

Wn(u) 0.4

0.8

0.8

0.8

0.8

0.8

1.0

1.0

1.0

1.0

1.0

ad14

h5BS(x, delta = 0.803, lambda = 22.506, xi = 4.046, gamma = −0.444, eta = 0.45) 0.00 0.01 0.02 0.03 0.04 0.0

60

h5BS(x, delta = 0.945, lambda = 33.008, xi = 5.146, gamma = −0.443, eta = 0.436) 0.000 0.010 0.020 0.030 0.0

t

1.0

1.0

40

0.8

0.8

BECab14 0.6

0.6

0.02

h(t)

u

c(0, W)

20

0.4

0.4

100

0.2

0.2

1.0

0.01

ac22

h5BS(x, delta = 0.54, lambda = 2.69, xi = 20.074, gamma = −2.839, eta = 0.29) 0.005 0.010 0.015 0.020 0.0

0.8

80 0.0

0.6

60 h5BS(x, delta = 0.862, lambda = 21.015, xi = 4.176, gamma = −0.531, eta = 0.519) 0.000 0.010 0.020 0.030 0.0

t 1.0

c(0, W)

40 0.8

0.4

1.0

0.00

0.6

0.2

0.8

1.0

u

0.6

0.8

20 0.4

0.4

c(0, W)

0.6

Wn(u) 0.4 0 0.2

Figure 9: Empirical scaled TTT plots of embryos in cleavage cycle 11 (first row) and corresponding hazard rates (second row)(b).

h5BS(x, delta = 1.035, lambda = 12.038, xi = 11.157, gamma = −1.597, eta = 0.39) 0.000 0.010 0.020 0.030 0.0

0.2

0.2 0.0

h5BS(x, delta = 0.864, lambda = 6.72, xi = 5.091, gamma = −1.557, eta = 0.431) 0.00 0.01 0.02 0.03 0.04 0.0

0.000

ab18 ad22 ad23

1.0

100

0.0

0

0.0

0.2

20

0.2

20

0.4 u

40

BECad4

0.4

40

0.6 0.8 1.0

t 60 80 100

Figure 8: Empirical scaled TTT plots of embryos in cleavage cycle 11 (first row) and corresponding hazard rates (second row)(a).

cb16

u 0.6 0.8 1.0

t 60 80 100

0.04 0.03 0.02 0.01 0.00

20

40

60

80

t

1.0

0.2

0.4

0.6

0.8

u

100

20

40

60

80

t

1.0

0.0

0.2

0.4

0.6

0.8

u

100

20

40

60

80

t

1.0

1.0 c(0, W)

0.0

0.2

0.4

0.6

0.8

1.0 0.2

0.4

c(0, W)

0.6

0.8

1.0 0.8 0.6

c(0, W)

0.2 0.0

ad27

0.0

0.2

0.4

0.6

0.8

u

100

20

40

60

80

t

1.0

h5BS(x, delta = 0.845, lambda = 17, xi = 7.958, gamma = −0.727, eta = 0.645) 0.000 0.010 0.020

0.8

h5BS(x, delta = 0.81, lambda = 29.997, xi = 5.94, gamma = −0.265, eta = 0.444) 0.00 0.01 0.02 0.03 0.04 0.0

0.6 u

h5BS(x, delta = 0.74, lambda = 21.995, xi = 4.156, gamma = −0.402, eta = 0.906) 0.000 0.005 0.010 0.015 0.020 0.025 0.0

0.4

ad16

0.4

0.8 0.6

c(0, W) 0.2

h5BS(x, delta = 0.822, lambda = 22.003, xi = 4.046, gamma = −0.539, eta = 0.592) 0.000 0.010 0.020 0.030 0.0

0.2

0.4

0.8 0.6

Wn(u)

0.4 0.2 0.0 0.0

h(t)

ad6

1.0

ad4

1.0

ab17

0.0

0.2

0.4

0.6

0.8

1.0

80

100

u

100

20

40

60 t

Figure 10: Empirical scaled TTT plots of embryos in cleavage cycle 12 (first row) and corresponding hazard rates (second row)(a).

0.8

0.05 0.04 0.03 h(t) 0.02 0.01 0.00 0

20

40

60 t

80

1.0

100

0.2

0.4

0.6

0.8

u

0

20

40

60 t

80

1.0

100

0.0

0.2

0.4

0.6

0.8

u

20

40

60 t

80

1.0

100

1.0 c(0, W)

0.2

0.4

0.6

0.8

1.0 0.0

0.2

0.4

c(0, W)

0.6

0.8

1.0 0.8 0.6

c(0, W)

0.2 0.0

BECac36

0.0

0.2

0.4

0.6

0.8

u

20

40

60 t

80

h5BS(x, delta = 0.965, lambda = 13.014, xi = 0.897, gamma = −1.455, eta = 0.359) 0.000 0.010 0.020 0.030 0.0

0.6 u

h5BS(x, delta = 0.787, lambda = 13, xi = 6.504, gamma = −2.715, eta = 0.014) 0.00 0.01 0.02 0.03 0.04

0.4

h5BS(x, delta = 1.325, lambda = 46.048, xi = 7.529, gamma = −0.612, eta = 0.19) 0.000 0.010 0.020 0.030 0.0

0.2

BECac12

0.4

0.8 0.6

c(0, W) 0.0

h5BS(x, delta = 0.783, lambda = 25.998, xi = 2.919, gamma = −0.277, eta = 0.575) 0.00 0.01 0.02 0.03 0.0

0.2

0.4

0.8 0.6 0.4 0.0

0.2

Wn(u)

BECab10

1.0

be7

1.0

be4

1.0

100

0.0

0.2

0.4

0.6

0.8

1.0

60

80

100

u

0

20

40 t

Figure 11: Empirical scaled TTT plots of embryos in cleavage cycle 12 (first row) and corresponding hazard rates (second row)(b).

5

Conclusion

A new five-parameter BS distribution, which is the extension of four BS distribution, was introduced and its parameters was estimated using MLE approach. Since there are no explicit solutions to the MLE equations, due to their non-linearity and complicated form, the MLE of five parameters were achieved numerically. Moreover, the pdf, cdf, qf and hazard rate of the proposed five-parameter BS distribution were theoretically discussed. Simulation results as well The outcome of fitting 5BS distribution to gene expression profile confirmed that the 13

5BS distribution fits very well to both simulated and real-world data. also indicated that this distribution performs very well.

References [1] Birnbaum, Z. W., Saunders, S. C. (1969). A new family of life distributions. J. Appl. Probab., 6, 319–327. [2] Leiva, V., Barros, M., Paula, G. A., Galea, M. (2007). Influence diagnostics in log-BirnbaumSaunders regression models with censored data. Computat. Statist. Data Anal., 51, 5694– 5707. [3] Barros, M., Paula, G. A., Leiva, V. (2008). A new class of survival regression models with heavy-tailed errors: robustness and diagnostics. Lifetime Data Anal, 14, 316–332. [4] Kundu, D., Kannan, N., Balakrishnan, N. (2008). On the hazard function of BirnbaumSaunders distribution and associated inference. Computat. Statist. Data Anal., 52, 2692– 2702. [5] Podlaski, R. (2008). Characterization of diameter distribution data in near-natural forests using the Birnbaum-Saunders distribution. Can. J. Forest Res., 18, 518–526. [6] Lio, Y. L. and Park, C. (2008). A bootstrap control chart for Birnbaum-Saunders percentiles. Qual. Reliability Eng. Int., 24, 585–600. [7] Leiva, V. (2016). The Birnbaum-Saunders Distribution, Academic Press, Elsevier. [8] Athayde, E., Azevedo, C., Leiva, V., Sanhueza, A. (2012). About Birnbaum-Saunders Distributions Based on the Johnson System. Communications in Statistics-Theory and Methods, 41, 2061–2079. [9] Hassani, H. and Ghodsi, Z. (2017). Evaluating the analytical distribution of bicoid gene expression profile. Meta Gene, 14, 91–99. [10] Hassani, H., Kalantari, M. and Ghodsi, Z. (2019). Spatial gradient of bicoid is well explained by Birnbaum-Saunders distribution. Medical Hypotheses, 122, 73–81. [11] Marshall, A. W. and Olkin, I. (1997). A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika, 84, 641–652. [12] Aarset, M. V. (1987). How to identify a bathtub hazard rate, IEEE Trans. Reliab., 36, 106–108. [13] Delignette-Muller, M. L. and Dutang, C. (2015). fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34. URL http://www.jstatsoft.org/v64/i04/. [14] Driever, W. and Nsslein-Volhard, C. (1988). The bicoid protein determines position in the Drosophila embryo in a concentration-dependent manner. Cell, 54(1), pp.95-104. [15] Bergmann, S., Sandler, O., Sberro, H., Shnider, S., Schejter, E., Shilo, B.Z. and Barkai, N. (2007). Pre-steady-state decoding of the Bicoid morphogen gradient. PLoS Biol, 5(2), p.e46. 14

[16] Houchmandzadeh, B., Wieschaus, E. and Leibler, S. (2002). Establishment of developmental precision and proportions in the early Drosophila embryo. Nature, 415(6873), pp.798-802. [17] Gregor, T., Bialek, W., van Steveninck, R. R. D. R., Tank, D. W. and Wieschaus, E. F. (2005). Diffusion and scaling during early embryonic pattern formation. Proceedings of the National Academy of Sciences of the United States of America, 102(51), pp.18403-18407.

Conflict of Interest None.

15