Abstract:

This research provides an in-depth analysis of the zero crossing rate (ZCR) and its application to the differentiation of music and non-music signals. It is not a survey of instrument classification strategies; rather, it covers several techniques and methodologies that use the average zero crossing rate. Both short-time energy and the short-time zero crossing rate form part of the findings. The differing characteristics of snare and bass drum sounds help in dealing with the surrounding noise (Banchhor and Khan, 2012). Several approaches for distinguishing sounds by frequency, wavelength, and amplitude are discussed. Clustering and discriminant analysis are among the most appropriate techniques for distinguishing different sounds effectively. The zero crossing rate is the most significant feature for this purpose, and the methodology is implemented in MATLAB (Al Shoshan, 2018).

The main accomplishment of the research is the implementation of a zero crossing rate methodology to find threshold values that separate musical from non-musical files. A practical experiment was conducted to determine these threshold values. The chosen zero crossing rate threshold yielded an 80% success rate when applied to new signals that were not used in deriving it, which is the main achievement of the research.

1. Introduction:

1.1 Background:

Music is a part of everyone's life and, according to George Orwell, it serves to keep individuals from thinking too much. Among the most intrusive music is that played over company telephone systems. Historically, systems could not distinguish between music and non-music signals. A Daedalus article in Nature (Jones, 1997) proposes a system that would differentiate between the two, i.e. speech and music, letting non-music signals through while suppressing or disrupting music signals. According to that proposal, the power spectrum of a music signal shows relatively regular peaks, and this pattern indicates music. The pattern can be used to drive a muting switch: the proposed device is connected to a phone and connects or disconnects the line based on the periodicity of the power spectrum. The same principle is still used to identify music and non-music signals. The proposed apparatus comprised a microphone, a music detector, and a processor, which detect music in the incoming signal across various frequency bands. This dissertation proposes to implement the idea as a digital signal processing system.

An audio signal can carry different kinds of sound: music, environmental sound, or other types of media. Differentiating between music and non-music signals is the main issue addressed here. Audio signals occupy different frequency ranges; some are music and some are not (Atal & Rabiner, 1976). Humans can make this distinction when listening, but it is not an easy task for computers, and many existing systems have a low accuracy rate. Audio is segmented and classified in a wide range of applications, including the entertainment industry, commercial use, music services, audio storage, and surveillance (Banchhor and Khan, 2012). Audio signal classification therefore has great scope for differentiating between categories of signals. Historically, databases such as collections of sound effects were tested to establish a clear distinction. Another direction taken in the past used transcribable data, dividing sound into three categories: speech, music, and sound effects. In short, a variety of techniques have been used to identify the content of media programmes (Al Shoshan, 2018).

Automatic Squelch Control (ASC) is used for controlling background noise, and it depends on signal strength. The applications of ASC include computing tools, consumer electronics, and automatic equalization. A taxonomy of auditory signals plays a prominent role in identifying and differentiating categories of music. Other concepts relevant here include discrete-time signals (sequences of quantities defined at discrete time instants) and the zero crossing rate. The zero crossing rate is widely used as a simple measure of the frequency content of a signal: it is the rate at which the audio waveform changes sign. Speech is a broadband signal, so interpreting its average zero crossing rate as a frequency estimate is not very precise. On the other hand, the short-time average zero crossing rate is still helpful, because it gives a rough spectral view that can be obtained easily (Atal & Rabiner, 1976).

Double cross engagement is another methodology, in which two things occur in the system at the same time, both of them significant. Two kinds of sound are produced in this case: snare-like and bass-drum-like. The two sounds differ in layout with respect to the linking strategies used. These arrangements are used to create the best possible melodic sounds and database. In other words, the double cross engagement methodology helps in finding the layout of the sounds and can also be used to build a melodic database.

MATLAB is an effective platform for determining the zero crossing rate and short-time energy (STE) of music signals. It supports in-depth, interactive analysis for computing the short-time energy. The short-time energy helps to distinguish between musical and non-musical signals and to identify speech, so it is essential for this classification. The MATLAB program computes both the STE and the short-time zero crossing rate (STZCR) of the audio signals.
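As an illustration of what computing STE and STZCR frame by frame can look like, the following is a minimal sketch rather than the program used in this study. The sampling rate, the 50 ms frame length, the synthetic two-tone test signal and all variable names are assumptions chosen only for the example.

```matlab
% Sketch only: frame-wise short-time energy (STE) and short-time zero
% crossing rate (STZCR) of a synthetic signal. All values are assumptions.
Fs = 8000;                                   % sampling rate in Hz (assumed)
t  = (0:Fs/2-1).' / Fs;                      % half a second of time stamps

lowTone  = sin(2*pi*200*t + 0.3);            % loud 200 Hz tone
highTone = 0.1 * sin(2*pi*2000*t + 0.3);     % quiet 2 kHz tone
y = [lowTone; highTone];                     % one second in total

frameLen  = 400;                             % 50 ms frames, no overlap (assumed)
numFrames = floor(length(y) / frameLen);
frames    = reshape(y(1:numFrames*frameLen), frameLen, numFrames);

% STE: sum of squared samples in each frame
ste = sum(frames.^2, 1);

% STZCR: sign changes between consecutive samples, per second, in each frame
signChanges = abs(diff(sign(frames), 1, 1)) > 0;
stzcr = sum(signChanges, 1) / (frameLen / Fs);
```

In this sketch the frames of the loud low tone have high energy and a low crossing rate, while the frames of the quiet high tone show the opposite pattern, which is the kind of contrast the two features are meant to expose.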

1.2 Rationale of the research:

The main focus of the research is on telling the difference between music and non-music signals. The study reviews different methodologies in order to choose the most appropriate one, and considers the frequency, wavelength, and amplitude of music and non-music signals. The aim of the research is to apply the zero crossing rate to simplify the interpretation of audio signals.

1.3 Research questions:

Following are the research questions to be addressed in the research:

  1. What is the difference between music and non-music signals?
  2. What are the methods to classify music signals?
  3. What impact does the zero crossing rate have on classification?
  4. How effective is the zero crossing rate in the working procedure?
  5. How does MATLAB measure the zero crossing rate?

1.4 Research objectives:

Following are the research objectives:

  1. Different resources will be used in the study to make the research authentic and focused. A literature review will be performed to identify the range of methods currently used to tell the difference between different types of audio signal.
  2. The research will then analyze the most promising method in more depth, so that detailed knowledge of it is available.
  3. The chosen method will be applied to a set of example signals to tune the parameters of the method.
  4. Those parameters will then be verified to find the success rate of the method by classifying a second set of test signals not used to derive the parameters. Achieving objectives 3 and 4 will give a verified method for telling the difference between music and non-music signals and will be an essential part of the project.
  5. With a successful classifier implemented, the project will then implement a means of disrupting any music signals by applying different delays to different frequency bands.
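Objective 5 can be pictured with a small sketch. This is only one possible reading of "different delays to different frequency bands", not a design taken from the project: the signal is split into two bands with an FFT mask, the upper band is delayed, and the bands are summed again. The 1 kHz band edge, the 20 ms delay, the synthetic test signal and all variable names are assumptions.

```matlab
% Sketch only: delay the high band of a signal relative to the low band.
% Band edge, delay and test signal are illustrative assumptions.
Fs = 8000;
t  = (0:Fs-1).' / Fs;
x  = sin(2*pi*300*t) + 0.5*sin(2*pi*1500*t);  % synthetic two-tone signal

N = length(x);
X = fft(x);
f = (0:N-1).' * Fs / N;          % frequency of each FFT bin in Hz
f = min(f, Fs - f);              % fold so mirrored bins share one frequency
cutoffHz = 1000;                 % band edge in Hz (assumed)

% Symmetric masks keep the inverse transforms real (up to rounding)
lowBand  = real(ifft(X .* (f <  cutoffHz)));
highBand = real(ifft(X .* (f >= cutoffHz)));

delaySamples = round(0.020 * Fs);                       % 20 ms (assumed)
highDelayed  = [zeros(delaySamples, 1); highBand(1:end-delaySamples)];

yOut = lowBand + highDelayed;    % recombined signal with misaligned bands
```

Without the delay the two bands add back up to the original signal, so any audible change in yOut comes only from the misalignment between bands.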

 

2. Literature Review

Recognizing objects in the environment from the sounds they produce is arguably the primary function of the auditory system. An organism that can detect a threat at a distance has an advantage (in the evolutionary sense) over one that cannot. Recognition is possible, in part, because the acoustic features of sounds often betray physical properties of their sources. As a simple example, large objects tend to produce sound energy at lower frequencies than small objects do. If an organism is to recognize sounds as arising from particular source classes, recognition should rest on those acoustic features that are invariant across the sounds within each class yet distinguish the sounds of different classes. For many classes of sound sources, acoustic characteristics that correspond to physical or social properties are examples of such highly discriminative features. Successful automatic classification of musical sounds is useful in many applications: classification of sound files distributed on the internet, automatic scoring of recorded music, automatic indexing of recordings, multimedia labelling, and many others (Qi & Hunt, 1993). Computational auditory scene analysis (CASA), automatic music transcription systems, and content-based search systems all find such a capability extremely useful. Many efforts in musical instrument recognition have been made over the last thirty years. Most of them have concentrated on single, isolated notes (either synthesized or natural) and tones taken from professional sound databases. Recent works have operated on real recordings, polyphonic or monophonic, multi-instrumental or solo. Nevertheless, the problem is still far from being solved.

Work on recognition from isolated notes nevertheless remains crucial, since it can lead to further improvement of the techniques used and to insights into multi-instrumental recognition (recognition of pieces with several instruments). Most of the recognition systems used so far focus on the timbral and spectral characteristics of the notes. Discrimination is based on features such as pitch (the quality of a sound that orders it by frequency), energy ratios, spectral envelopes and Mel-frequency cepstral coefficients (a representation of the short-term power spectrum based on a linear cosine transform). Temporal features, other than attack, duration and tremolo, are seldom considered. Classification is done using k-NN classifiers, HMMs, Kohonen SOMs and neural networks. A limitation of such methods is that in real instruments the spectral features of the sound are rarely constant. Even when the same note is being played, the spectral components change. To build a robust recognition system, one has to take into account many timbral components and how they can vary, which is often somewhat random (Siegel, 1979).

2.1 Zero Crossing Rate

In the context of discrete-time signals, a zero crossing is said to occur when successive samples have different algebraic signs. The rate at which zero crossings occur is a simple measure of frequency: it is the number of times within a given time interval that the amplitude of the speech signal passes through zero (Bachu, et al., 2008). Speech signals are broadband signals, so the interpretation of the average zero crossing rate is not very precise. Nevertheless, rough estimates of spectral properties can be obtained from a representation based on the short-time average zero crossing rate. Alongside these advantages, the zero crossing rate has drawbacks that can make it inaccurate in some cases. The number of zero crossings within a segment is always an integer, whereas a continuous-valued measure would support more detailed analysis. The measure is therefore meaningful only on longer segments; shorter segments may contain very few zero crossings. To make the measure consistent, the signal must be assumed to have zero mean.
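The zero-mean assumption can be made concrete with a short sketch. It is an illustration under assumed values, not part of the study: a 100 Hz tone is placed on a constant offset, and zero crossings are counted before and after the mean is removed. The sampling rate, the offset of 2, the phase offset and the helper name countCrossings are all assumptions.

```matlab
% Sketch only: effect of a non-zero mean on the zero crossing count.
Fs = 8000;                              % sampling rate in Hz (assumed)
t  = (0:Fs-1).' / Fs;                   % one second of time stamps
x  = sin(2*pi*100*t + 0.3);             % 100 Hz tone; the phase offset keeps
                                        % samples from landing exactly on zero
xOffset = x + 2;                        % same tone on a constant offset

% Hypothetical helper: number of sign changes between consecutive samples
countCrossings = @(s) sum(abs(diff(sign(s))) > 0);

nOffset   = countCrossings(xOffset);                  % offset signal
nCentered = countCrossings(xOffset - mean(xOffset));  % mean removed first
```

The offset signal stays positive throughout, so no crossings are counted until the mean is subtracted, which is why the signal is assumed to be zero mean before the rate is interpreted.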

2.2 End Point Detection

Another difficult yet essential problem in speech processing is identifying when an utterance starts and ends. This process is known as endpoint detection. When unvoiced sounds occur at the beginning or end of an utterance, it becomes hard to separate the speech signal accurately from the background noise (Inbar, et al., 1986). A speech threshold is therefore determined from the peak energy and the silence energy. Initially, the endpoints are taken to be the points where the signal energy crosses this threshold. Errors in these initial estimates are corrected by examining the zero crossing rate in the vicinity of the endpoints and comparing it with that of silence. If detectable changes in the zero crossing rate occur outside the initial thresholds, the endpoints are moved to the points at which the changes take place.

Since the sounds we deal with may be mixed with noise, an accurate envelope of the signal around the onset is hard to determine. However, the quality of the estimates of the attack and decay times is directly linked to the accuracy of the envelope. We determine envelopes by finding the maximum of the waveform in windowed portions of the signal. Since the attack time is typically short and the sound is non-harmonic (it does not exhibit a harmonic structure), we cannot use the FFT to determine the size of the window (Hägg & Suurküla, 1991). Instead, we determine the window size with a local search algorithm that stops when it finds a window size that is stable (with respect to the attack time estimate). Labelling the regions of percussive sounds as attack and decay segments, we consider temporal descriptors such as attack and decay time, energy parameters, and frequency descriptors computed over the different regions.

Since some sound regions can be quite short (around 30 milliseconds), we additionally performed Prony modelling, a method for determining natural frequencies and responses, over the attack and decay regions. This improves frequency precision. From this modelling, we keep the first two coefficients (damping factor and frequency) of the frequency component with the highest magnitude, as well as features such as the number of sinusoids found in each region. Finally, zero crossing rates were also computed in the regions defined. The ZCR computation was implemented with a concern for the treatment of two additive noises (Shete, et al., 2014). The signals we are dealing with are short (typically < 200 ms); consequently, a low-frequency note (w.r.t. the inverse of the duration of the signal) played by an overlapping instrument (for example a bass) acts as a disturbance of the mean level. The second kind of noise we need to handle concerns the high-frequency components of other instruments (again w.r.t. the inverse of the duration of the signal), whose amplitudes can be considered small compared to the amplitude of the percussive sound around the onset (for example voices, cymbals). These two kinds of signal are treated as noise in the determination of the ZCR of the percussive sounds (Conradsen, et al., 2011).
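The energy-threshold step of endpoint detection described above can be sketched as follows. This is a simplified illustration, not the procedure of the cited authors: it covers only the first pass based on peak and silence energy and omits the refinement with the zero crossing rate. The 10 ms frame length, the 10% threshold rule, the synthetic signal and all variable names are assumptions.

```matlab
% Sketch only: first-pass endpoint detection from frame energy.
Fs = 8000;                                        % sampling rate in Hz (assumed)
frameLen = 80;                                    % 10 ms frames (assumed)

leadIn = 0.01 * randn(Fs/2, 1);                   % background noise
burst  = sin(2*pi*300*(0:Fs/2-1).'/Fs + 0.3);     % stand-in for an utterance
tail   = 0.01 * randn(Fs/2, 1);                   % background noise
x = [leadIn; burst; tail];

numFrames = floor(length(x) / frameLen);
frames = reshape(x(1:numFrames*frameLen), frameLen, numFrames);
energy = sum(frames.^2, 1);                       % energy of each frame

silenceEnergy = mean(energy(1:10));               % assume first 100 ms is silence
peakEnergy    = max(energy);
threshold = silenceEnergy + 0.1 * (peakEnergy - silenceEnergy);  % assumed rule

active     = find(energy > threshold);            % frames above the threshold
startFrame = active(1);
endFrame   = active(end);
startTime  = (startFrame - 1) * frameLen / Fs;    % seconds
endTime    = endFrame * frameLen / Fs;            % seconds
```

In a fuller implementation the zero crossing rate just outside startFrame and endFrame would be compared with that of the silent lead-in, and the endpoints moved outwards if it differed noticeably.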

2.3 Speech Inclusion

Speech can be divided into voiced and unvoiced regions. Dividing the speech signal into voiced and unvoiced parts provides a preliminary acoustic segmentation for speech processing applications (Barnett & Kedem, 1991). These include speech recognition, speech synthesis and speech enhancement. Voiced speech consists of more or less constant-frequency tones of some duration, produced when vowels are spoken. Sound is produced in the presence of air; where there is no air, sound can be neither produced nor transmitted.

Unvoiced speech, on the other hand, consists of non-periodic, random-like sounds caused by air passing through a narrow constriction of the vocal tract. Researchers have made consistent efforts to find a representation that is accurate and covers all the relevant parts of the speech signal, and it is in this context that the zero crossing rate is introduced. Finally, the classification task addressed here is very specific. Given any musical title from large popular music databases, and assuming that the stream of events to classify consists of only two families of events, snare-like sounds and bass-drum-like sounds, as yielded by the detection technique, we address the problem of discriminating between them.

2.4 Automatic Descriptors of Audio

Most of the work done on automatic descriptors of audio focuses on a few elements: low-level or mid-level descriptors, small or medium-sized data sets, and so on. Rhythm, on the other hand, is a dimension of music perception that is widely investigated in computer music. However, automatic transcription does not capture it well and can give unexpected results. It is still a poorly understood phenomenon. Designing cognitive models of rhythm perception, producing acceptable notation from a list of onset events such as MIDI notes, and deriving from it musicological abstractions such as tempo and meter are still unsolved problems and are clearly out of the scope of this paper. A few works on the automatic transcription of percussive music exist; however, there is no reference representation of rhythm that can be used for classification purposes. To produce such a representation, we believe that we need to extract occurrences of percussive timbres from the audio signal, the reason being that we target applications dealing with popular music, in which rhythm is a predominant feature that is mostly conveyed by a particular set of timbres: the drum sounds. The problem we address here is the classification of percussive timbres into two classes: snare-like and bass-drum-like sounds, as found in popular music titles. We now give a short overview of the detection scheme used to provide time indexes of occurrences of percussive timbres. The classification task discussed in this paper follows this detection and precedes the construction of higher-level representations of rhythm. The integration of the results presented here into a complete system for automatically extracting rhythmic structures from audio is the object of a forthcoming paper (Khan, et al., 2012).

2.5 Conclusion   

After a detailed analysis of the secondary research, the zero crossing rate was selected as the methodology. In classification tasks, it is commonly understood that the issues one is likely to address are the following: the type of features to use, the actual classification technique to use and, in the case of supervised analysis, the size of the data collected for the training stage. This paper will not present a review of instrument classification strategies. The exact classification task we address here is different from the identification task, although the reader may still find such reviews useful. As the physical characteristics of snare and bass drum sounds vary greatly across large databases of titles, and so does the type of surrounding noise, the classification scheme used should allow the class boundaries to differ from one title to another. This justifies the use of an unsupervised classification technique: agglomerative clustering (which groups clusters based on their similarities). However, to compute distances, this technique must be given an input parameter: the dimension onto which sounds are projected. To get clues concerning relevant dimensions, we believe that defining a large number of features of percussive sounds and performing a Discriminant Factor Analysis over a small set of sounds is justified. As stated in the first paragraph, the problem addressed here is not the design of a universal timbre space over which one would achieve instrument identification. Research on this subject assumes that monophonic and clean sounds are given.
On the whole, this approach is essential because it helps in measuring wavelengths accurately and can generate productive results for researchers. The zero crossing rate will be the method used to analyze the musical and non-musical files.

3. Methodology

The first approach used in this research is qualitative: an analysis of the difference between musical and non-musical files with respect to the zero crossing rate. The research design is therefore qualitative and comprehensive. To analyze the zero crossing rate, the work of different authors and researchers was read and analyzed thoroughly, with the aim of contributing to existing research. Publications by well-known researchers, including research articles and books, were used as secondary sources. All articles and readings were read carefully to identify inadequacies and problems, which were then analyzed and addressed, and recommendations are provided. Because the research uses both qualitative and quantitative methods, it is a mixed approach. This is beneficial because it shows the subject from the different perspectives that the research needs, within the time available.

The research method is document analysis of archival data accessed via the internet. The method of research, that is, the form of data collection and data analysis, is defined below, together with the justification for selecting this particular methodology.

Qualitative methods are used to investigate the wavelength of two kinds of speech, i.e. voiced and unvoiced speech. The zero crossing rate is the key element, used to analyze how often zero crossings occur and hence to measure frequency. For an in-depth study of the topic, the researcher also adopted a quantitative approach in order to obtain precise results. This was necessary because the data is numeric or graphical in nature. The data is presented in tables and figures for a better understanding of the topic.

One of the more straightforward ways to interpret the zero crossing rate is as a measure of the smoothness of a signal: the number of zero crossings in a specific segment of the signal is counted, and a signal that oscillates slowly crosses zero rarely. For instance, a 100 Hz signal crosses zero 200 times per second, whereas an unvoiced fricative can have 3000 zero crossings per second. To calculate the zero crossing rate, the signs of each pair of consecutive samples are compared. If N is the length of the signal, this takes O(N) operations, which is faster than calculating the frequency content of the signal with an FFT, which takes O(N log N) operations. The zero crossing rate is significant here because it is higher for high-voiced files and lower for low-voiced files. These calculations are straightforward and do not require much time, which makes the zero crossing rate a useful tool for low-complexity applications.
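The pairwise sign comparison described above can be written as a single pass over the samples. The sketch below is illustrative only: it compares a 100 Hz tone with white noise, where the noise is merely a crude stand-in for an unvoiced fricative. The sampling rate, the phase offset and all variable names are assumptions.

```matlab
% Sketch only: O(N) zero crossing rate by comparing consecutive sample signs.
Fs = 8000;                            % sampling rate in Hz (assumed)
t  = (0:Fs-1).' / Fs;                 % one second of time stamps
tone  = sin(2*pi*100*t + 0.3);        % 100 Hz tone, phase offset avoids exact zeros
noise = randn(Fs, 1);                 % white noise, stand-in for a fricative (assumption)

signals = {tone, noise};
rates   = zeros(1, 2);                % crossings per second for each signal
for k = 1:2
    s = signals{k};
    count = 0;
    for n = 2:length(s)               % one comparison per pair of neighbours
        if sign(s(n)) ~= sign(s(n-1))
            count = count + 1;
        end
    end
    rates(k) = count / (length(s) / Fs);
end
```

Each sample is visited once, so the cost grows linearly with the signal length, and the noise-like signal yields a far higher rate than the tone.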

In the research design, the energy calculation is combined with the zero crossing rate, because this helps in giving relevant answers to the research questions. The zero crossing count serves as an indicator of the frequency at which the energy is concentrated in the signal spectrum. For this reason the zero crossing rate is an important feature for classifying speech as voiced or unvoiced, and it is also used in the front-end processing of automatic speech recognition systems.
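A minimal sketch of how energy and zero crossing rate might be combined in a voiced/unvoiced decision is given below. The decision rule, both threshold values, the 30 ms frame length, the synthetic frames and the helper names are assumptions made for illustration; they are not values derived in this research.

```matlab
% Sketch only: voiced/unvoiced decision from frame energy and ZCR.
Fs = 8000;                                   % sampling rate in Hz (assumed)
frameLen = 240;                              % 30 ms frame (assumed)
n = (0:frameLen-1).';

voicedLike   = sin(2*pi*150*n/Fs + 0.3);     % strong low-frequency tone
unvoicedLike = 0.05 * randn(frameLen, 1);    % weak noise-like frame

% Hypothetical helpers
frameEnergy = @(s) sum(s.^2);
frameZcr    = @(s) sum(abs(diff(sign(s))) > 0) / (length(s) / Fs);

energyThreshold = 1;      % assumed value
zcrThreshold    = 1500;   % crossings per second, assumed value

% Assumed rule: voiced frames have high energy and a low crossing rate
isVoiced = @(s) frameEnergy(s) > energyThreshold && frameZcr(s) < zcrThreshold;

voicedDecision   = isVoiced(voicedLike);
unvoicedDecision = isVoiced(unvoicedLike);
```

Using both features together is more forgiving than either alone: a quiet low-frequency hum fails the energy test, and a loud burst of noise fails the crossing-rate test.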

4. Resources

The audio files were obtained from the internet for use in MATLAB. The research also needs some additional resources, for instance a computing device that calculates the zero crossing rate, records all the data, and analyzes it. The zero crossing rate analysis also requires sound samples, which are collected from the same source. The collected data must be accurate, that is, error free. Digital data must also be accessible in order to check which audio files have the sound and frequency content that needs to be monitored. On the whole, this is a time-consuming process, but relevant resources are used.

5. Limitations and Contingency Plan

Because of current challenges, certain limitations were faced during this research. The research is oriented towards engineering, so it should rest mainly on quantitative data. However, data and other resources were limited, so there was more reliance on qualitative data that is itself backed by evidence. Few people are familiar with the zero crossing rate, which again meant relying on secondary data that had already been collected. Another major limitation is data availability, so the relevance of the collected data cannot be guaranteed.

Moreover, there are very few researchers working on this topic, so the data may be homogeneous. The research was conducted over a short period, so it is unlikely to capture any longitudinal effect.

There are contingency plans as well, both technical and simple. One idea is to continue the research after the course is completed, which would give the researcher ample time to obtain results. In addition, both softcopy and hardcopy journals have to be sourced to obtain results that are as accurate as possible. The researcher can also request the technology needed to calculate zero crossing rates so that relevant answers can be gathered.

6. Zero crossing rate:

To find the zero crossing rate, we coded a program in MATLAB.

The zero crossing rate is defined as the rate at which a signal changes sign, that is, the rate at which it passes from positive through zero to negative and vice versa. Zero crossings occur in both musical and non-musical (speech) signals. The measure is important for detecting human speech or a musical segment, and it is well known from voice activity detection (VAD). Here it is used to identify music signals by measuring the smoothness of the signal. The method proved effective at differentiating between musical and non-musical signals. The comparison is made by computing the rate for different samples with the MATLAB program and comparing the results (Al Shoshan et al., 2004).

As noted earlier, a zero crossing occurs when successive samples have different algebraic signs. The rate of zero crossings is a measure of frequency, and this is what differentiates musical from non-musical signals (Conradsen, 2011). The zero crossing count is the number of times the signal passes through the zero value. Speech is a broadband signal, so its average zero crossing rate gives only a rough estimate of its frequency content. Even so, the zero crossing rate is used to derive spectral properties that help differentiate between music and non-music signals (Banchhor and Khan, 2012).

Coding Explanation:

Figure 6.1 shows the MATLAB code. The code reads an audio file in WAV format, storing the samples in the array y and the sampling frequency in the variable Fs. It then computes the fast Fourier transform (FFT) of the signal y, builds a vector of the frequencies at which the FFT spectrum is evaluated, and builds the corresponding vector of time values. The original time-domain signal is plotted. The magnitude and phase of the spectrum are then computed, with the DC frequency point shifted to the middle of the graph, and both are plotted in the same figure. After that, the zero crossing count is incremented whenever the algebraic sign of the present sample differs from that of the previous sample. Finally, the zero crossing rate per second is calculated from the zero crossing count and the duration of the signal.

    % Read the audio file: y holds the samples, Fs the sampling frequency (Hz)
    %[y,Fs] = audioread('handel.wav');
    [y,Fs] = audioread('Music25.wav');
    %[y,Fs] = audioread('guitartune.wav');
    NFFT = length(y);
    Y = fft(y,NFFT);
    F = ((0:1/NFFT:1-1/NFFT)*Fs).';          % frequency vector, 0 to Fs
    t = (0:1/Fs:(NFFT-1)/Fs);                % time vector in seconds
    plot(t,y);                               % time domain signal
    soundsc(y,Fs);
    magnitudeY = abs(fftshift(Y));           % magnitude, DC moved to the centre
    phaseY = unwrap(angle(fftshift(Y)));     % phase, DC moved to the centre
    figure();
    subplot(211);
    % (F runs from 0 to Fs, so the axis labels are not centred on DC)
    plot(F,log10(magnitudeY));
    subplot(212);
    plot(F,phaseY);
    % find the zero crossings in y and count them
    old_sign = 0;
    crossing_count = 0;
    for c = 1:length(y)
        if (sign(y(c)) ~= old_sign)
            %sprintf('%s%d', 'zero crossing at t= ', t(c))
            crossing_count = crossing_count + 1;
        end
        old_sign = sign(y(c));
    end
    sprintf('%s%i', 'zero crossing count = ', crossing_count)
    crossing_rate = crossing_count/(t(NFFT))   % zero crossings per second
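The counting loop above can also be written without an explicit loop. The following is a minimal sketch under stated assumptions, not the code used to produce the results: it uses a synthetic single channel test tone instead of an audio file, and the sample rate, tone frequency and phase offset are illustrative values chosen here. It keeps the same counting rule as the listing, including the initial old_sign of 0.

```matlab
% Illustrative sketch: vectorized zero crossing count and rate.
% The test signal below is an assumption, not one of the project files.
Fs = 8000;                          % assumed sampling frequency in Hz
t  = (0:Fs-1)/Fs;                   % one second of sample times
y  = sin(2*pi*100*t + 0.1);         % synthetic 100 Hz tone with a small phase offset

s = sign(y(:));                     % algebraic sign of every sample
prev = [0; s(1:end-1)];             % sign of the previous sample, starting from 0
crossing_count = sum(s ~= prev);    % same rule as the loop: count every sign change
duration = (numel(y) - 1)/Fs;       % matches t(NFFT) in the listing
crossing_rate = crossing_count/duration;   % zero crossings per second
```

For a stereo file the listing indexes y(c) linearly, which walks through the first channel only, so a comparable vectorized version would pass y(:,1) rather than the whole matrix.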

 

6.1 Methodology of finding zero crossing rate:

The main focus of the research was to find the threshold value between musical and non musical files. The zero crossing rate was selected because it is considered to be highly efficient to code in MATLAB. A further reason is that the zero crossing rate is lower for low voiced files and higher for high voiced files, which makes it an effective discriminator. The practical work proceeds by measuring the zero crossing rate of both the music and the non music files. This evaluation, along with the coding of the program, helps in finding the threshold value. (Al Shoshan et al., 2004)

The procedure for finding the threshold value by coding a program in MATLAB is described in this section.

A qualitative methodology is used to investigate the threshold value for musical signals. Analysing the code and the graphs obtained helps in finding the zero crossing rate and comparing it with a threshold value. Measuring the frequency content of each file was helpful here because it can be presented in graphical form. A MATLAB program was used to calculate the zero crossing rate of each signal and the threshold value. Forty sample signals were used, and a graphical illustration was produced for each one.

The samples used to find the threshold value were processed consecutively. Here consecutive means that, for a signal of length N samples, counting the zero crossings takes O(N) operations. (G. C. Agarwal et al., 2010). Using MATLAB reduces the complexity of calculating and evaluating the threshold value. The energy calculation can be combined with the zero crossing rate, which helps in assessing the findings. The zero crossing rate is the most effective technique to implement in automated speech processing.
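To illustrate how an energy calculation can sit alongside the zero crossing count, the following is a minimal sketch of frame by frame short time energy and zero crossing counting. It is an assumption made for illustration, not the procedure used for the results in this dissertation, which measures one zero crossing rate per file. The test signal, the frame length of 400 samples and the use of non overlapping frames are all hypothetical choices.

```matlab
% Illustrative sketch: short time energy and zero crossing count per frame.
% Signal, frame length and framing scheme are assumptions for this example.
Fs = 8000;                                   % assumed sampling frequency in Hz
t  = (0:Fs-1)/Fs;                            % one second of sample times
half = Fs/2;
y  = [sin(2*pi*100*t(1:half) + 0.1), ...     % first half: loud 100 Hz tone
      0.2*sin(2*pi*1000*t(half+1:end) + 0.1)];  % second half: quiet 1000 Hz tone

frame_len = 400;                             % 50 ms frames, no overlap
n_frames  = floor(numel(y)/frame_len);
energy = zeros(1, n_frames);
zcc    = zeros(1, n_frames);
for k = 1:n_frames
    frame = y((k-1)*frame_len + (1:frame_len));
    energy(k) = sum(frame.^2);                    % short time energy of the frame
    zcc(k) = sum(abs(diff(sign(frame))) > 0);     % sign changes inside the frame
end
```

Each sample is visited once, so the total cost stays at O(N) for a signal of N samples. In this synthetic case the second half of the signal has lower energy but a higher zero crossing count than the first half.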

The practical work performed with 40 sample signals not only determines the threshold value but also helps in determining the frequency, wavelength, and spectrum of the signals. All these factors help in differentiating between musical and non musical signals. The graph in Appendix A shows an example waveform of a musical sample. The graphical illustration not only reveals the zero crossings, but also assists in identifying musical signals and their frequency. (Barnett & Kedem, 1991). The graph gives an indication of the frequency content, the zero crossing rate, and the periodicity of the signal. The range of values in MATLAB helps in identifying the minimum values. For example, if Y is the vector holding the contents of a file, then the content is measured by the values present in the vector Y. The minimum value in each vector is an indication of the threshold value. Each column has a different spectrum, and for musical patterns the threshold value comes out differently. (Banchhor and Khan, 2012).

The coding in MATLAB was carried out efficiently, provided that appropriate commands are given. Every command must be logical in order to obtain the correct values. A set of functions is required to find the zero crossing rate in MATLAB, and the zero crossing rate was found by following the list of commands given above. The final graph obtained by running the program in MATLAB is presented in the appendix. Graphical illustrations were obtained and compared with those of the other samples. A music signal has a different spectrum from a non musical one. (Banchhor & Khan, 2012).

Checking the simulation and its stop point helps in illustrating a zero crossing event. The details are set in the solver options for zero crossing, which determine the parameters of zero crossing detection. Zero crossing detection has several parameters, and its behaviour can be changed by altering the procedure used to determine the threshold values. The adaptive algorithm is the most efficient strategy to consider here. (Atal & Rabiner, 1976).


 

7. Results:

7.1 Finding Threshold – “training process.”

After the zero crossing rates of the sample signals had been calculated in MATLAB, the threshold value was found to be 3530. The threshold is given at the end of Table 7.1, alongside the zero crossing rate of each file. A total of 40 sample files was used in the practical work. The threshold refers to the zero crossing rate and was chosen by considering the zero crossing rates of all the signals. The graphical illustrations present the frequency, wavelength, and spectral pattern of each signal as well. To make the experimental findings valid, the musical files were compared with the other samples. The zero crossing rates differ between the music and the non music files, and the graphs present the values for each musical spectrum. (Zero Cross Rating MATLAB, 2020).

Table 7.1 shows the calculated zero crossing rate for 40 different sample signals. Half those signals were musical, and the other half were non musical. For each signal the type was noted (“music” or “non music”) by the investigator listening to each sample. The zero crossing rate was calculated, and a table formed, with the samples sorted by zero crossing rate, from smallest to largest. As expected based on the work identified in the literature review, sorting by zero crossing rate gave a relatively good segmentation of the results, with mostly non music signals at the top of the table, and mostly music at the bottom.

This observation led to the following method for telling the difference between music and non music signals. A threshold value is to be chosen, so that if a signal has a zero crossing rate below the threshold it will be classified as non music and if its zero crossing rate is above the threshold it will be classified as music. By adjusting the threshold, it is then possible to find a value for the threshold that minimizes the number of misclassifications. For the data listed in Table 7.1, it was found that a threshold value of 3530 minimized the errors. The two rightmost columns of the table show the result of applying the threshold. It is then necessary to verify whether that is a reasonable threshold when applied to other signals, not used in the determination of the original threshold value. That process is described in the next section.

 

 

 

 

 

Type        Zero crossing rate   Classifier   Error
---------   ------------------   ----------   -----
Music       2325.2               Non music    Error
Music       2574.8               Non music    Error
Non music   2637.4               Non music    OK
Non music   2778.5               Non music    OK
Music       2780.1               Non music    Error
Non music   2816.7               Non music    OK
Non music   2968.8               Non music    OK
Non music   3009.7               Non music    OK
Non music   3014.8               Non music    OK
Non music   3045.2               Non music    OK
Non music   3046.3               Non music    OK
Non music   3103.3               Non music    OK
Non music   3205.5               Non music    OK
Non music   3214.5               Non music    OK
Music       3221.9               Non music    Error
Non music   3230.2               Non music    OK
Non music   3242.1               Non music    OK
Music       3285.4               Non music    Error
Non music   3331.6               Non music    OK
Non music   3502.1               Non music    OK
Non music   3523.5               Non music    OK
Music       3546.6               Music        OK
Music       3569.8               Music        OK
Music       3591.6               Music        OK
Non music   3605                 Music        Error
Music       3651.7               Music        OK
Non music   3669.6               Music        Error
Music       3720.5               Music        OK
Music       3865.6               Music        OK
Music       3871.1               Music        OK
Music       3984.8               Music        OK
Non music   4171.7               Music        Error
Music       4198.7               Music        OK
Non music   4226                 Music        Error
Music       4587.9               Music        OK
Music       4620.8               Music        OK
Music       4675                 Music        OK
Music       4740.5               Music        OK
Music       4977.5               Music        OK
Music       4979.8               Music        OK

Table 7.1 – Table of zero crossing values and threshold classifications for a threshold of 3530. Right hand column indicates if classifications are OK or in Error.

Threshold: 3530
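The threshold search described in this section can be sketched as a simple sweep over candidate thresholds. This is an illustration under stated assumptions rather than the procedure actually followed: the zero crossing rates are copied from Table 7.1, the candidate thresholds are taken as the midpoints between neighbouring rates, and all variable names are chosen here for the example.

```matlab
% Illustrative sketch: sweep candidate thresholds over the Table 7.1 data.
% Zero crossing rates copied from Table 7.1 (20 music, 20 non music).
music = [2325.2 2574.8 2780.1 3221.9 3285.4 3546.6 3569.8 3591.6 3651.7 3720.5 ...
         3865.6 3871.1 3984.8 4198.7 4587.9 4620.8 4675 4740.5 4977.5 4979.8];
non_music = [2637.4 2778.5 2816.7 2968.8 3009.7 3014.8 3045.2 3046.3 3103.3 3205.5 ...
             3214.5 3230.2 3242.1 3331.6 3502.1 3523.5 3605 3669.6 4171.7 4226];

rates = sort([music non_music]);
candidates = (rates(1:end-1) + rates(2:end))/2;   % midpoints between neighbouring rates
errors = zeros(size(candidates));
for k = 1:numel(candidates)
    th = candidates(k);
    % music below the threshold and non music above it are misclassified
    errors(k) = sum(music < th) + sum(non_music > th);
end
[min_errors, best] = min(errors);
best_threshold = candidates(best);

% Number of misclassifications at the threshold chosen in the text
errors_at_3530 = sum(music < 3530) + sum(non_music > 3530);
```

With these data the sweep reaches its minimum of 9 misclassifications out of 40 only for thresholds between 3523.5 and 3546.6, which is consistent with the chosen value of 3530 and with the nine rows marked Error in Table 7.1.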

 

 

7.2 Verifying Threshold.

To verify the chosen threshold, ten more signals, not used in the original determination of the threshold, were selected at random. Five of those were non music, and five were music signals.

Table 7.2 shows how those signals were classified using the previously determined threshold. As can be seen, the threshold led to a correct result in 80% of cases, which, given the random choice of these verification signals, gives some confidence that the chosen threshold will prove useful in general.

 

 

Name          Type        Zero crossing rate   Classifier   Error
-----------   ---------   ------------------   ----------   -----
Non music22   Non music   2552.1               Non music    OK
Non music24   Non music   3076.4               Non music    OK
Non music23   Non music   3125.2               Non music    OK
Non music21   Non music   3188.9               Non music    OK
Music25       Music       3285.5               Non music    Error
Music23       Music       3548.8               Music        OK
Music24       Music       3799.5               Music        OK
Music21       Music       3995.7               Music        OK
Music22       Music       4176.8               Music        OK
Non music25   Non music   4285.4               Music        Error

Table 7.2 – Table of zero crossing rates for new verification signals, classified using the original threshold.

Total errors: 2
Error %: 20.00%
Success %: 80.00%
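The verification step can be reproduced with a few lines of MATLAB. This is a minimal sketch using the zero crossing rates and types copied from Table 7.2; the variable names are assumptions made for this example.

```matlab
% Illustrative sketch: apply the threshold of 3530 to the Table 7.2 signals.
threshold = 3530;
% Zero crossing rates in the order listed in Table 7.2
rates = [2552.1 3076.4 3125.2 3188.9 3285.5 3548.8 3799.5 3995.7 4176.8 4285.4];
% True type from the Type column: 1 = music, 0 = non music
is_music = logical([0 0 0 0 1 1 1 1 1 0]);

predicted_music = rates > threshold;             % above the threshold is classified as music
n_errors = sum(predicted_music ~= is_music);     % misclassified signals
success_rate = 100*(1 - n_errors/numel(rates));  % percentage classified correctly
```

The two mismatches are Music25, which falls below the threshold, and Non music25, which falls above it.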

 

8. Conclusions and Further Work

According to the qualitative findings of the research, both musical and non musical signals were studied in detail to obtain an accurate analysis of the zero crossing rate. The research describes different methods of finding the difference between musical and non musical files. The possible methods of differentiating between them are zero crossing rate, endpoint detection, speech inclusion, FFT, and the automatic descriptor of audio. All of these methods were examined in order to analyse in detail the difference between musical and non musical files. Among them, the most efficient method is the zero crossing rate (ZCR), because it is more efficient to code in MATLAB than the FFT. All 40 musical and non musical files were processed using this method. The resulting threshold value is 3530, as shown in Table 7.1.

A detailed analysis of previous investigations contributed to finding the zero crossing rate using MATLAB. MATLAB calculates the zero crossing rate once a program has been written and run on an audio file. The methodology is concerned with the sign changes that occur along the signal. After a detailed analysis of the zero crossing rates of the musical and non musical files, the best choice of zero crossing rate threshold comes out to be 3530, with signals whose zero crossing rate is less than 3530 being classified as non music and signals above it being classified as music. Music and non music files have different values, and this depends on the differences in frequency content between them. The comparison between the musical files and the other sample files was a clear indication of the difference in spectrum and zero crossing rate. The graphs indicated that musical and non musical signals have different parameters, and these parameters help in finding the threshold value. The research was about the differences between musical and non musical files, along with a discussion of the different methods used to classify them. The measurement of the threshold value was a practical exercise in measuring zero crossing rates with MATLAB: the procedure measures the zero crossing rate of both the musical and the non musical files, and comparing the results across the sample files identifies the threshold.

In a nutshell, after running the program in MATLAB, different zero crossing rates were found for the musical and the non musical files. This means that musical and non musical signals can be distinguished from each other.

 

8.1 Evaluation of project objectives

Section 1.4 of this dissertation sets out the objectives of the project. Here those objectives are evaluated in turn. The majority of the project objectives were achieved successfully.

  1. Literature review: The literature review described multiple ways of detecting and classifying music signals. Different strategies were discussed, including zero crossing rate, endpoint detection, and speech inclusion. The automatic descriptor of audio was also part of the literature review. From the analysis of all the methods, the most appropriate one to consider here is the zero crossing rate.
  2. In depth analysis of the most promising method: The most promising method considered here is the zero crossing rate. It is a highly effective strategy because it differentiates well between musical and non musical signals. It also differentiates effectively between low and high voiced signals, which would not be possible using the other methods.
  3. Method applied to a set of example signals: To find the threshold value, a training process was used, which helps in differentiating between musical and non musical signals. Forty musical and non musical files were used in total, and their zero crossing rates were compared in order to find the threshold value. A MATLAB program was used to calculate the zero crossing rates.
  4. Parameters verified: When the zero crossing rate threshold chosen in section 7.1 is applied to a set of new signals (not used to determine the threshold), those new signals are correctly classified with an 80% success rate. Therefore, it is concluded that the zero crossing rate method is successful.
  5. Disrupting any music signals: With a successful classifier implemented, the project was then intended to implement a means of disrupting any music signals by applying different delays to different frequency bands. This objective was not completed.

 

8.2 Summary of main project achievements

The achievements of the research correspond to its aims and objectives. First, secondary sources were analysed in detail in the form of a literature review, which identified multiple methods for differentiating between audio signals, including zero crossing rate, endpoint detection, speech inclusion, and the automatic descriptor of audio, each supported by previous research. The second accomplishment was the identification of the most suitable method for differentiating between different types and spectra of audio signals: the zero crossing rate was identified as the most promising technique. This method was then applied in the practical work of the research. A program was written and run in MATLAB to study the parameters of musical and non musical audio signals, and this program makes it possible to differentiate between music and non music signals. Classifying multiple signals and tuning the threshold parameter helps to identify the differences between them. Finally, separate samples of musical and non musical signals were also used in the research, as they allow the differentiation between music and non music signals to be checked.

8.3 Suggestions for further work

The objective that could not be achieved here is the disruption of music signals. Future studies will involve disrupting music signals, with a focus on collecting primary data for analysing that disruption. Primary data here means choosing audio files and carrying out a detailed analysis of every file first hand in MATLAB. The primary data collection, along with the disruption of the music signals, will be included in further studies involving various frequency bands. Working with these frequency bands and disrupting the music signals will help complete the research, and will also be helpful in differentiating between different signals.

With the advancement of knowledge, people will become more aware of the zero crossing rate and related concepts. Future research will be more heterogeneous, including a diverse range of musical and non musical signals of different spectra and drawing multiple studies together in one place, which will help the results to generalize. The collection of primary data and a focus on diverse concepts in the disruption of musical and non musical files will therefore be included in the future work, as implementing the research first hand will help in dealing with the limitations and weaker points that could not be addressed in this research.

 

Bibliography:

Al Shoshan, A. I., 2018. Speech and Music Classification and Separation: A Review. Available at: https://www.sciencedirect.com/science/article/pii/S101836391830850X (Accessed: September 2, 2020).

Ahmadi, S. & Spanias, A., 1999. Cepstrum based pitch detection using a new statistical V/UV classification algorithm. IEEE Transactions on Speech and Audio Processing, 7(3), pp. 333-338.

Atal, B. & Rabiner, L., 1976. A pattern recognition approach to voiced unvoiced silence classification with applications to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 24(3), pp. 201-212.

Bachu, R., Kopparthi, S., Adapa, B. & Barkana, B., 2008. Separation of voiced and unvoiced using zero crossing rate and energy of the speech signal. In American Society for Engineering Education (ASEE) Zone Conference Proceedings, pp. 1-7.

Banchhor, S. & Khan, A., 2012. Musical instrument recognition using zero crossing rate and short time energy. Musical instrument, 1(3), pp. 1-4.

Barnett, J. & Kedem, B., 1991. Zero crossing rates of functions of Gaussian processes. IEEE Transactions on Information Theory, 37(4), pp. 1188-1194.

Conradsen, I. et al., 2011. Automated algorithm for generalized tonic clonic epileptic seizure onset detection based on sEMG zero crossing rate. IEEE Transactions on Biomedical Engineering, 59(2), pp. 579-585.

Hägg, G. & Suurküla, J., 1991. Zero crossing rate of electromyograms during occupational work and endurance tests as predictors for work related myalgia in the shoulder/neck region. European Journal of Applied Physiology and Occupational Physiology, 62(6), pp. 436-444.

Plonus, M., 2002. Music Signal, Music Signal an overview | Science Direct Topics. Available at: https://www.sciencedirect.com/topics/engineering/music-signal (Accessed: September 2, 2020).

G. C. Agarwal, G. L. G., C. N. Christakos, S. L., S. Cobb, A. F., DeLuca, C. J., D. Graupe, W. K. C., D. Hary, M. J. B., G. F. Inbar, A. N., R. Kadefors, E. K., B. Kedem, E. S., Kedem, B., H. Kranz, A. M. W., P. Lago, N. B. J., L. Lindstrom, R. M., L. Lindstrom, R. K., L. H. Lindstrom, R. I. M., C. D. Marsden, J. C. M., H. S. Milner Brown, R. D. S., J. T. Mortimer, R. M., Rice, S. O., T. W. Schweitzer, J. A. F., F. B. Stulen, C. J. D. L. and Zetterberg, L. H., 2010. Monitoring surface EMG spectral changes by the zero crossing rate. Medical & Biological Engineering & Computing. Kluwer Academic Publishers. Available at: https://link.springer.com/article/10.1007/BF02441600 (Accessed: September 9, 2020).

Zero Cross Rating MATLAB, 2020. Zero Crossing Rate, Zero Crossing Rate File Exchange MATLAB Central. Available at: https://www.mathworks.com/matlabcentral/fileexchange/31663-zero-crossing-rate (Accessed: September 9, 2020).

Jones, 1997. Daedalus: Music with everything. Nature, 390, 18/25 December 1997.

Appendix A

This graphical illustration shows the time domain traces of the left and right channels of a stereo musical audio file.
