College Homework Help on validity

I need to include:

1- Definition of validity

2- Its importance

3- Its main types with a brief definition for each type; one sentence for each could be enough. However, the types used in this study (face, content and construct) need more details.

4- The work I have written is highlighted in green. However, every statement or section between quotation marks is not mine; it is taken from other studies to be paraphrased if possible.

5- You do not need to read the sources in the Word file itself. I put them there for myself, to take some useful expressions and paraphrase them.

6- I attached some useful studies that may help in paraphrasing some statements. When you open the PDF file, you can search for the word validity and you will find some sections highlighted by me.


In short, I need two or three pages discussing the validity of my questionnaire. I will provide you with any documents that may help in understanding the context of the study.



Validity of the questionnaire


Validity refers to the degree to which an instrument measures what it has been intended to measure. In other words, a valid instrument measures what it was designed to measure, and not something else. For example, when a researcher wants to measure students’ anxiety, the instrument should measure students’ anxiety and not something else, such as students’ motivation.


“Validity is also dependent on the measurement measuring what it was designed to measure, and not something else instead”

“Validity is the extent to which a concept, conclusion or measurement is well-founded and corresponds accurately to the real world. The word ‘valid’ is derived from the Latin validus, meaning strong”. “The validity of a measurement tool (for example, a test in education) is considered to be the degree to which the tool measures what it claims to measure”. “Validity refers to the extent to which an instrument really measures what it purports to measure, and it is assessed through a number of processes”. “Validity measures the degree to which a study succeeds in measuring intended values and the extent to which differences found reflect true differences among the respondents (Cooper & Schindler, 2008). Cooper & Schindler (2008) went further to add that there are three types of validity tests: content, construct and criterion-related validity tests.”

Validity is described as the degree to which a research study measures what it intends to measure.  There are two main types of validity, internal and external.  Internal validity refers to the validity of the measurement and test itself, whereas external validity refers to the ability to generalise the findings to the target population.  Both are very important in analysing the appropriateness, meaningfulness and usefulness of a research study.  However, here I will focus on the validity of the measurement technique (i.e. internal validity).


Validity is the extent to which a measurement tool measures what it’s supposed to measure. Remember your thermometer? It’s measuring the room temperature, not your body temperature. Since it’s supposed to be measuring your body temperature, the thermometer is not valid.


According to ????, “Validity (similar to reliability) is based on matters of degrees; validity is not an all or nothing idea”. There are many different types of validity. “Messick (1995) points out that validity is a matter of degree, not absolutely valid or absolutely invalid”

“It is important to bear in mind that validity and reliability are not an all or none issue but a matter of degree” [Do you think I need to include this notion? If so, it needs paraphrasing]



The primary methodological significance of the present study lies in the process of designing a Language Learning Orientation Questionnaire and testing its validity through a series of trials before the final version emerged. This procedure involved two sessions of inter-rater reliability checks and trials with the draft questionnaires, cross-referenced to the results of the interview questions.




“In this study categories were exhaustive and exclusive, meaning that all relevant concepts were represented in the coding scheme, which may provide an indication of good content validity (Neuendorf, 2002). Incorporating a qualitative study where categories were developed inductively from the manifest content of the text of content analysis may have also contributed to achieving content validity (Rourke & Anderson, 2004). An attempt was also made to provide thorough information about inter-rater agreement, training procedures for coders and examples of the coding scheme. As Rourke & Anderson (2004) suggested, this is another important step to be taken towards establishing validity. Empirical evidence for validity can also be gathered mainly through examination of group differences and through the use of alternative methods of data collection to corroborate the results of content analysis (Rourke & Anderson, 2004). This study showed that the developed coding scheme was sensitive to the differences between the different rehabilitation programmes, providing further evidence for its validity.”


The critical review research tool, which was in the form of a questionnaire, was presented to five external referees, who were faculty members from different institutions. The referees were asked to read each question, to comment on the questionnaire items in terms of wording and content, and to give their comments and suggestions for improving the scale. All members are listed in Appendix 5-

The aim of the test of the pedagogical design factors was to confirm with experts the validity of the content of the learning resources, and to determine whether it was educationally sound.

Establishing validity can be accomplished in many ways. Cohen mentioned many types of validity, such as …; however, many researchers claim that there are four main types of validity, which are ……… Regarding the present study, only the relevant types of validity will be discussed.


Importance: [can you discuss it briefly?]

Many researchers agree that testing the validity of an instrument used in a study is very important. Research has shown that validity plays a crucial role in ensuring that the results of a study are meaningful and that the conclusions drawn from them can be trusted.


Cohen et al. (2007) point out that “Validity is an important key to effective research. If a piece of research is invalid then it is worthless. Validity is thus a requirement for both quantitative and qualitative/naturalistic research”

“Validity is one of the most important concepts in survey research. Without validity, you have meaningless results and have wasted a great deal of time, energy, and money. Much of the confusion surrounding validity is probably due to the cavalier manner in which the term frequently is used. This Research Note presents general issues surrounding validity, as well as different types of validity.”

Validity is important because it can help determine what types of tests to use, and help to make sure researchers are using methods that are not only ethical and cost-effective, but also methods that truly measure the idea or construct in question.


My Validation process:


In this study, the validity of the questionnaire was a central concern. Face validity, content validity and construct validity were assessed to examine whether the questionnaire measures what it was intended to measure.

“As noted by Sohlberg & Mateer (2001), realistically a measure cannot fully capture all the relevant aspects of a construct and, therefore, more than one measure would be required in order to acquire a comprehensive appreciation of the construct. Similarly to face validity, the evaluation of content validity is based on subjective judgments made by experts. Further evidence would be, however, needed in order to support the value of the measure in relation to the theoretical construct it is supposed to measure.”


Construct validation requires an ongoing process in which specific hypotheses are formed based on the theoretical construct and examined against data collected through the measure. For example, the measure is expected to correlate with similar variables (convergent validity), whereas low correlations should be seen between the measure and other unrelated constructs (discriminant validity) (Bowling, 2009). In this way, the scientific integrity of both the theory of the construct and the instrument is examined (Sohlberg & Mateer, 2001). Users may also be interested in the degree to which data obtained from the instrument are consistent with other observable criteria (concurrent validity) or in whether these data can predict future performances, such as which patients will benefit the most from rehabilitation (predictive validity).





Face validity

It refers to the content of the concept in question. It was used as a supplemental form of validity. According to Bryman (2012), face validity might be established by asking people whether the measure appears to reflect the concept concerned.


Face validity refers to the extent to which a tool appears to assess what it was designed to assess. In other words, “Face validity can be described as a sense that the questionnaire looks like it measures what it was intended to measure”. “Were the questions phrased appropriately? Did the options for responding seem appropriate?” “Face validity is only considered to be a superficial measure of validity, unlike construct validity and content validity, because it is not really about what the measurement procedure actually measures, but what it appears to measure”

Validating the questionnaire: several drafts of the questionnaire went through a series of validation processes, following the questionnaire design guidelines proposed by Oppenheim (1992), Dornyei (2003) and Bradburn et al. (2004). [I also took several steps to increase the validity of the questionnaire answers ]

A number of validation processes were used. First, the researcher gave the questionnaire to two Saudi students to check its clarity. It was then reviewed by two English teachers holding master’s degrees in TESOL and by a PhD student in TESOL. Finally, a pilot study was conducted.




Face validity is the extent to which a test is subjectively viewed as covering the concept it purports to measure. It refers to the transparency or relevance of a test as it appears to test participants. In other words, a test can be said to have face validity if it “looks like” it is going to measure what it is supposed to measure. For instance, if you prepare a test to measure whether students can perform multiplication, and the people you show it to all agree that it looks like a good test of multiplication ability, you have shown the face validity of your test. Face validity is often contrasted with content validity and construct validity.


Some people use the term face validity only to refer to the validity of a test to observers who are not expert in testing methodologies. For instance, if you have a test that is designed to measure whether children are good spellers, and you ask their parents whether the test is a good test, you are studying the face validity of the test. If you ask an expert in testing spelling, some people would argue that you are not testing face validity. This distinction seems too careful for most applications. Generally, face validity means that the test “looks like” it will work, as opposed to “has been shown to work”.


The questionnaire was given to a number of students to comment on the clarity of items. Face validity was assessed by supervisors and teachers experienced in the area of TESOL.

A face validity test was conducted in which four independent judges were invited to participate.

To ensure face validity, the researcher presented the RSQ to a group of 10 Saudi EFL teachers, 10 EFL OU graduates, and three OU faculty members. They were given the first version of the instrument to comment on the clarity of items and suggest changes. (see appendix D).


The questionnaire was tested by two upper secondary students who provided insightful comments on the wording of the items and the structure of the questionnaire.

Face Validity and Pilot Test






A pilot study can assist in enhancing the validity, reliability and practicability of a questionnaire (Oppenheim, 1992; Morrison, 1993; Cohen et al., 2000).

– pilot “feedback”

Content Validity

  • ask experts in the field to judge whether the instrument reflects the concept being measured.
  • “Content validity” = you more systematically examine or inventory the aspects of the construct and determine whether you have captured them in your measures.


“The extent to which a measure covers a representative sample of what it is intended to measure is referred to as content validity” (Anastasi & Urbina, 1997).


It refers to the degree to which an instrument “covers a representative sample” of what it is designed to measure.

It is the degree to which an instrument (a measurement) captures all aspects (or covers the content) that are relevant to and representative of a construct.


Content validity is the extent to which the elements within a measurement procedure are relevant and representative of the construct that they will be used to measure (Haynes et al., 1995)

It covers content that matches all relevant subject matter in its academic discipline.

Content validity was achieved by submitting the questionnaire to experts in the field of educational research and the field of teaching English as a second/foreign language to examine and evaluate the content and the format of the questionnaire before the final version was sent out to the participants.

“The test should evaluate only the content related to the field of study in a manner sufficiently representative, relevant and comprehensible.”


To assess content validity, besides checking the clarity of the items, five experts in the field of TESOL were asked to assess the content of the questionnaire in terms of the items’ relevance to the topic being measured. In other words, the experts were asked to examine how well the questionnaire items covered the content of the subject being measured; specifically, how well the items identify the factors related to writing in English in the Saudi context.

“In addition, bilingual experts reviewed both the English and Arabic versions of the survey to ensure the comparability of the instruments”. The researcher translated the questionnaire into Arabic. Then the Arabic version was translated back into English by a bilingual expert who had not seen the original English version. After that, the two English versions were compared to check whether they had similar meanings (Cohen et al., 7th ed., p. 139).


Finally, some questions were eliminated and revised. “The main research supervisor suggested the elimination of many items that were found unnecessary or which overlapped with other items in the questionnaire. He suggested some changes to the sentence structures to avoid using negatives and double negatives in the statements adopted from Cheng’s (2004) Second Language Writing Anxiety Inventory (SLWAI) and to avoid double-barrelled statements such as “Learning about different writing techniques and styles is important for me”, and on wording levels to accord with the target population samples’ proficiency level. He suggested linking the questionnaire items in order to assess the defined variables with the web-based learning platform characteristics, in order to portray a holistic image of how these two sources integrate in order to present condensed data. The new modified version of the questionnaire was submitted to the thesis co-supervisor and two faculty members to solicit feedback. A further modification was performed to execute a suggestion made by two of the questionnaire reviewers to rephrase a particular statement using neutral terms in order to avoid bias and the tendency to lead to a specific choice. Two doctoral colleagues and an experienced ESL English native speaking teacher were also solicited for feedback to ensure the clarity and appropriateness of the questionnaire statements to the targeted participants. Suggested changes were reviewed and the questionnaire was reedited to the final format prior to being piloted.”



However, one possible problem of using a questionnaire is that “responses may be inaccurate or incomplete because of the difficulty involved in describing learner-internal phenomena such as perceptions and attitudes, for example” (Mackey & Gass, 2005: 96). To solve such a problem and to check the validity of this questionnaire, the first version of the self-report strategy questionnaire was trialled with the researcher’s supervisor. In doing so, the supervisor validated the questionnaire by matching the strategies to their descriptions. Then the researcher discussed the questionnaire with the supervisor and revised the descriptions. In addition, two Thai students in Southampton were asked to correct any mistakes or unclear statements and to give comments on the Thai version of this questionnaire. Finally, some questions were eliminated and revised. The final self-report strategy questionnaire consisted of 33 communication strategy statements (see Appendix A) and was administered to the participants in the main study. To measure the reliability of the returned questionnaires, Cronbach’s alpha, a measure of internal consistency, was also used in this study. Cronbach’s alpha analyses yielded reliability coefficients for the total scale of 0.78 before the CS instruction and 0.72 after the CS instruction. These results demonstrated that all the items in the questionnaire could measure the students’ CS use with enough consistency (see Pallant, 2007: 98). An example of CS statements in the self-report strategy questionnaire is shown below in Figure 3.2.
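Cronbach’s alpha, as reported above, can be computed directly from the item responses. The sketch below uses an invented response matrix (rows = respondents, columns = Likert items), not the actual questionnaire data, to show how the coefficient is obtained:

```python
# Cronbach's alpha for internal consistency:
#   alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total scores))
# The matrix below is made-up example data (5 respondents x 4 items),
# not the data reported in the study.
from statistics import pvariance

responses = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 5],
]

k = len(responses[0])                        # number of items
item_scores = list(zip(*responses))          # one tuple of scores per item
item_variances = [pvariance(item) for item in item_scores]
total_scores = [sum(person) for person in responses]

alpha = (k / (k - 1)) * (1 - sum(item_variances) / pvariance(total_scores))
print(f"Cronbach's alpha = {alpha:.3f}")
```

A value above 0.7 is conventionally taken as acceptable internal consistency, which is the criterion applied to the 0.78 and 0.72 coefficients reported above.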




Content validity

According to Cooper & Schindler (2008), content validity refers to the degree to which the content of the items adequately represents the universe of all relevant items under study. The content validity of this research was validated by determining the variables which have been defined and used previously in the literature (Churchill & Iacobucci, 2009). Furthermore, three marketing professors were asked to review the questionnaire before it was pre-tested. Moreover, in order to elicit comments about the content validity, respondents were asked to describe any difficulties they had in completing the questionnaire accurately.


Validity of the questionnaire

The researcher anticipated the need to ensure the validity of the questionnaire built for the purpose of this study, and content validity is usually established by professionals who select appropriate content for questions and statements. This leads to the results of a questionnaire or survey being considered valid if the questions are appropriate and necessary to identify a specific attribute, state or quality (Glenn, 2008). In this case, the pedagogical design criteria were based on Nielson's (1994) work, with some additions influenced by Squires and Preece (1999), and Wright (2003).

The questionnaire itself was presented to external experts, who were experienced photography teachers from different countries including Bahrain, United Kingdom, Oman, Kuwait and Canada. They were asked through the questionnaire to validate the content and pedagogical design of a new Photography (Depth of Field) e-learning application, and they were found to be in total agreement that the items of the questionnaire measured what it was supposed to measure. The following section presents the detailed results of this validation process.


The comments were very useful for the researcher to improve the prototype, and the improved version of the final prototype questions was used for the pre-test and the post-test.

Validity and Reliability of the Questionnaire

The engagement, enjoyment and learning questionnaire was presented to five external referees, who were faculty members from different institutions. All referees are listed in Appendix 9-7 of this study.

The referees were asked to read each question and to comment on the items in terms of wording and content, and to give their comments and suggestions for improving the questionnaire measurement.

The questionnaire was administered to twenty-five students at the College of Education, selected at random, to check the reliability of the questionnaire. Cronbach's Alpha was used to determine the reliability of each factor measured in the questionnaire, and a high level of reliability was indicated (See Table 9.5), with values of .906 for effectiveness, .885 for engagement, and .881 for enjoyment.


The EFL teacher knowledge and practice questionnaire consists of three parts. The first part asks the teachers to provide demographic information about themselves, including name (optional), gender, qualifications, years of experience, and school and stage taught. This part is important for describing the sample, and it could also be used in inferential statistics. The second part represents the content to be inquired into: seven areas to be explored, namely knowledge of subject matter, knowledge of pedagogy, curricular knowledge, knowledge about students, knowledge about self, knowledge about contextual factors, and sources of teacher knowledge. The third part of the questionnaire is open-ended, allowing the respondents to add comments; it also asks the respondents to write their contact details if they are willing to participate in the other phases of the research. The questionnaire ends by thanking the respondents for completing it (see appendix A).

The construction of the questionnaire was informed by three sources: a review of related literature and similar instruments, the advice of TEFL experts, and personal experience working as a preparatory and secondary school EFL teacher as well as a teaching assistant at the university for six years from 2000 to 2006. A thorough review of literature relevant to EFL teacher knowledge and practice provided familiarity with the range of issues related to teacher knowledge and practice. Once the statements were decided upon, the questionnaire was sent to a panel of TEFL experts familiar with the Egyptian context to ensure the content validity of the questionnaire. Among the panel were the head of the TEFL department of my home university, other TEFL staff members and colleagues. Responses were obtained from 5 TEFL experts who helped in refining the questionnaire before its implementation. The items of the questionnaire were also validated concurrently by the teachers' responses in the interviews covering the issues included in the questionnaire.

Apart from face validity, the findings of the questionnaire were checked against common sense and/or the participants' justifications of their views. The findings were also examined in relation to existing literature. Not all findings would be expected to match existing literature; otherwise, the originality of the current research might be jeopardized. Nevertheless, this attempt was made to support the argument that the convergence between some of the findings of the current research and those of existing literature could be a step towards ensuring the validity of the questionnaire and the credibility of the participants' responses. In addition, some of the questionnaire findings were compared to findings of similar items existing in related literature. Thus, the concurrent validity of the questionnaire, or some parts of it, could be ensured.

The questionnaire in its final version consists of 42 items representing various areas of teacher knowledge, contextual factors, and sources of teacher knowledge. The design of the questionnaire followed a five-point Likert scale. This design is widely used and easy to construct, and it can provide precise information about a respondent's degree of agreement or disagreement (Oppenheim, 2001). The overall consistency measure of the questionnaire was .731. This level is acceptable given that the study is mainly interpretive.


7.1 Validity and reliability of the results

A specialised committee comprising a small group of academics from the Department of Curriculum and Instruction at the College of Basic Education had already prepared and checked the validity and reliability of all the achievement tests of the Course (304). All the questions in both the mid-term and final tests of the course were objective test questions (MCQ, true-false, matching). This type of question has right or wrong answers only, which means that students’ responses can be evaluated objectively. It can also be seen how the results of the qualitative data supported the results of the quantitative data in terms of what students in the first and second experimental groups mentioned in the interviews, namely, that the teaching method of blended learning used in this course (304) was very effective in facilitating, explaining and clarifying the course content as well as helping them save study time for the mid-term and final exams. As they put it, they did not need any intensive study on the ‘test night’. This concerns the validity and reliability of the results from the first question of the research (students’ achievements).

As to the results of the second question of the research (students’ satisfaction), I have explained in detail in the methodology chapter how I checked the validity and reliability under the supervision, guidance and cooperation of my Supervisor and some other academics at the College of Basic Education. It can also be seen how the results of the qualitative data supported the results of the quantitative data in terms of what students in the first and second experimental groups mentioned in the interviews about the teaching method of blended learning used in this course (304), and how they were very satisfied with this particular teaching method. The results obtained by the quantitative methods (satisfaction questionnaire) matched up with the results obtained from the qualitative methods (interviews). This is confirmed by the data from both the satisfaction questionnaire and interviews. The statements on the first dimension of the satisfaction questionnaire (the teaching method) showed greater satisfaction among the two experimental groups than the control group. This dimension contained statements such as the following: I enjoyed the teaching method in this course; the teaching method in this course helped me understand the content; the teaching method encouraged me to exert more effort in studying. And it is to be noted that the students in the first and second experimental groups expressed their views over such statements during the interviews.

For the group interviews, I consulted the interviewee-students about a suitable and comfortable time and location to conduct the interviews, in order to avoid any unwanted influence on their responses. All of them agreed to conduct the interview at the same time as the weekly lecture on the General Teaching Method Course (304), and to have it in the same room as the weekly lecture. I started each interview with an introduction to its purpose, and I reminded the students of the importance of this research, assuring them that all information would be translated into English and treated as fully confidential. During the interviews, the same method was used to ask the questions and to provide an opportunity for all students to express their views, before asking them again to confirm the accuracy of the information they had provided; this was done in all the interviews to ensure that all students responded to the interview questions under the same conditions. All the interviewee-students agreed to the use of a digital voice recorder during the interviews to help with transcribing their responses.


As clarified above, data were collected and analyzed through quantitative and qualitative methods. The instruments were questionnaires, interviews and observations.

Validity and reliability were addressed differently at different stages of the research. In quantitative research, “validity might be improved through careful sampling, appropriate instrumentation and appropriate statistical treatments of the data” (Cohen et al., 2004, p.105). In qualitative research, however,

validity might be addressed through the honesty, depth, richness and scope of the data achieved, the participants approached, the extent of triangulation and the disinterestedness or objectivity of the researcher (Cohen et al., 2004, p.105).

In quantitative research reliability can be achieved through controllability, predictability, consistency and replicability of instrumentation, data and findings. However, in qualitative research reliability can be achieved through the match between the collected data and what actually happens in the natural setting researched, in other words, ‘a degree of accuracy and comprehensiveness of coverage’ (Bogdan and Biklen, 1992, p.48, cited in Cohen et al., 2004, p.119).

Therefore, I chose a questionnaire survey which I felt would provide consistent and replicable results. On the other hand, I needed to know if what the teachers said in the survey was matched by actual practice, so that maximum reliability could be achieved in the study. For this reason, I interviewed the teachers and observed their actual teaching.

To increase the validity of data, two types of triangulation were employed: data triangulation and methodological triangulation. Data were collected through a variety of data sources (i.e. questionnaire, interview and observation) and quantitative and qualitative research methods were employed in the study since “the strengths of one approach can compensate for the weakness of another approach” (Marshall and Rossman, 1989, pp.79- 111, cited in Patton, 2002, p.306) and “no single source of information can be trusted to provide a comprehensive perspective” (Patton, 2002, p.306).


3.3.1. Questionnaire

To detect possible problems in advance and to modify the instrument before it was used in the actual study, the questionnaire was piloted twice on some EFL teachers at tertiary level. Piloting was carried out in the first week of March, 2007 with 20 EFL teachers at tertiary level.

When the questionnaire results were subjected to Pearson’s r correlation test, three pairs of items which were expected to correlate with each other did not correlate in the practice part of the questionnaire. These items were P23 & P32, P21 & P29, and P27 & P32. After the specified items had been modified, in other words, their wording changed, the questionnaire was administered to the same participants for a second time, a fortnight after the first piloting, and the responses were compared. It was found that the responses given to the unchanged items did not change, whereas those for the modified items did. The reliability coefficient of the specific questionnaire items which were designed to convey the same idea, and thus expected to correlate, was calculated, and it was found that they all correlated. Then the belief and practice items were subjected to reliability tests, and the Cronbach alpha scores were calculated as .8580 and .8240, respectively. The questionnaire was then administered to 140 EFL teachers in Cyprus Turkish secondary state schools in the first week of April, 2007.

The questionnaire distributed to the EFL teachers consisted of two sections: (1) beliefs and (2) practice (see Appendix 1). It was prepared based on my knowledge about Traditional and Constructivist teaching gained through my 14-year teaching experience, through my experiences as a learner and through the knowledge that I had gained by reading relevant literature, and the ideas of the new curriculum. The questionnaire consisted of 34 items: 17 items about teachers' beliefs (first part) and 17 items about teachers' practice (second part). In both parts, a 5-point Likert-scale format was used because Likert scales are “powerful and useful”, “for they combine the opportunity for a flexible response with the ability to determine frequencies, correlations and other forms of quantitative analysis. They afford the researcher the freedom to fuse measurement with opinion, quantity and quality” (Cohen et al., 2004, p.253). Besides, Likert scales are considered reliable (Oppenheim, 1997, p.200).

I strove for internal consistency in item selection in order to collect trustworthy data. For this purpose, some items focusing on the same idea were worded differently in both the beliefs and practice parts of the questionnaire. For example, in the beliefs part, B3: "Learners need to learn in a cooperative and collaborative environment" and B11: "A foreign language teacher should strive for maximum interaction among the learners" were the two items designed to gather data about the teachers' perceptions of the value of interaction between and among learners in foreign language learning and teaching. Both items were designed to probe the teachers' ideas about group work and pair work for interaction. Similarly, P23: "I consider the differing needs of individual students when planning activities" and P32: "I consider the individual differences among my students" were the two practice items designed to gather data on the value of paying attention to individual variation among learners in foreign language teaching.

The theoretical foundations of my questionnaire were "EFL Teacher's and Learner's Role", "Learning Environment", and "EFL Learning". These theoretical concepts informed the construction of the belief and practice items.

In the questionnaire, the response categories for beliefs were "Strongly agree, agree, uncertain, disagree, strongly disagree" and, for practice, "Always, most of the time, sometimes, seldom, never". The questionnaire items were designed to elicit information about the teacher's role, the learners' role, the learning environment, and EFL learning within the framework of the new curriculum, in other words, the CLT and Constructivist framework.

These themes were reflected in both parts of the questionnaire. The items about beliefs had corresponding items in the practice part.

While the dependent variables were "beliefs" and "practice", the independent variables were "gender", "length of experience" and "qualification". To prevent confusion, I explained to the participants that the questionnaire was divided into two parts: the first to gather data about their beliefs and the second to elicit data about their perceived instructional practice. It needs to be acknowledged that I assumed that teachers who disagreed with the Constructivist statements held more Traditional views, although it was difficult to be certain about the teachers' replies without interviewing all of them. Since the questionnaire data would not help me explore the participants' subjective meanings, I conducted interviews to generate data yielding the teachers' subjective views, and to gain a better understanding of their actual practice I drew on the observational data.

The questionnaire was translated into Turkish, the teachers' native language, by the researcher and by another colleague. The two translations were compared to see whether they matched, and one version was arrived at. This version was then given to another colleague to back-translate into English in order to check the reliability of the translation. The back-translation resulted in different wording for some items, so I sought the advice of another colleague who was an expert in translation, and the necessary modifications were made in light of his advice.

Various measures were taken to ensure a maximum return rate. Feedback received from the participants at the piloting stage indicated that the questionnaire was easy and quick to fill in. Interesting questions were placed at the beginning, and the questions eliciting demographic information were asked at the end. The participants were thanked both at the beginning and at the end of the questionnaire. They were also informed about why their participation was important and appreciated, and information on returning the questionnaire (i.e. where and how) was supplied.

Including all the EFL teachers in Cyprus Turkish secondary state schools had the potential to strengthen the external validity of the study and thus make the quantitative findings more easily generalizable (Cohen et al., 2004, p. 109). However, the researcher was aware that it could never be guaranteed that all the EFL teachers would return the questionnaire. Since non-response bias is a threat to the external validity of a survey, the researcher visited the schools and handed the survey instrument to the participants herself. The questionnaires were distributed according to previously allocated pseudo-identity codes for individual teachers. This process helped me to identify possible interview and observation participants by examining their responses to the demographic questions in the questionnaire. I especially wanted an equal number of experienced (i.e. six years and above) male and female teachers from the pilot schools where the new curriculum was implemented. The rationale for choosing an equal number of males and females was to be fair to both sexes and to see whether there were any significant differences with regard to gender.

The final response rate was 58%: out of 140 teachers, 81 returned the questionnaires. Those who did not respond had various reasons. Some teachers did not want to take part in the investigation and gave no excuse. Informal discussions indicated that many of the teachers seemed tired, bored and fed up with filling in the many long and impractical questionnaires that had been given to them before my study. One group of teachers refused because of their heavy teaching loads; others were in a rush to get through the syllabus.




Construct Validity


Construct validity refers to the extent to which the constructs hypothetically relate to one another to measure a concept based on the theories underlying the research (Parasuraman et al., 2004; Malhotra, 1999; Zikmund, 2003). In this study, factor analysis was performed to measure the dimensions of a concept as well as to identify which items were appropriate for each dimension. Furthermore, to achieve construct validity, the measurement should be assessed through convergent validity and discriminant validity. Convergent validity requires that the items purporting to measure the same construct correlate positively with one another (Parasuraman et al., 2004; Malhotra, 1999), whereas discriminant validity requires that an item does not correlate too highly with items of different constructs (Malhotra, 1999; Hair et al., 2003). In this study, the correlation matrix and inter-construct correlations were used to analyse convergent and discriminant validity.

To confirm construct validity, the measurement was assessed through convergent and discriminant validity. Convergent validity is shown when the items used to measure the same variable correlate highly with one another.

Discriminant validity is shown when items correlate more highly with items intended to measure the same variable than with items used to measure a different variable.
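The correlation-matrix check described above can be sketched in Python with NumPy. Everything below is illustrative: the two "constructs", the item counts, and the simulated responses are assumptions for the sake of the example, not data from the study:

```python
import numpy as np

# Hypothetical item-response matrix: columns 0-2 are items for construct A,
# columns 3-5 are items for construct B (names and data are illustrative).
rng = np.random.default_rng(1)
n = 100
factor_a = rng.normal(size=n)
factor_b = rng.normal(size=n)
items = np.column_stack(
    [factor_a + rng.normal(0, 0.5, n) for _ in range(3)]
    + [factor_b + rng.normal(0, 0.5, n) for _ in range(3)]
)

# Full inter-item correlation matrix (items are columns, so rowvar=False).
corr = np.corrcoef(items, rowvar=False)

# Convergent validity: items of the same construct should correlate highly.
within_a = corr[0, 1]
# Discriminant validity: cross-construct correlations should be much lower.
cross = corr[0, 3]
print(round(within_a, 2), round(cross, 2))
```

In a real analysis one would inspect the whole matrix (or average the within-block and between-block correlations) rather than a single pair, but the decision rule is the same: within-construct correlations should clearly exceed cross-construct correlations.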

In this phase of the study, exploratory factor analysis was performed to assess convergent and discriminant validity. The factor analysis was performed on Part 2 of the questionnaire, which measures the external factors that impact the success of SMEs, with the exclusion of the items with low internal consistency. Thus, 48 items were subjected to factor analysis using principal component analysis as the extraction technique and varimax with Kaiser normalization as the rotation method. Appendix C.4 shows the rotated component matrix with 14 factors specified and significant factor loadings emphasized. Steenkamp & van Trijp (1991) argued that substantial and statistically significant factor loadings signify the existence of convergent validity, with a recommended value of > .50 (Hildebrandt, 1987). The appendix thus confirms the convergent validity of all the constructs by showing that all of the item loadings were significant and well above the acceptable cut-off point of .50.
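The extraction-and-rotation procedure described above (principal-component extraction followed by varimax rotation, with a .50 loading cut-off) can be sketched as follows. The varimax routine is a standard textbook implementation, and the data are simulated with two latent factors and six items, not the 48-item questionnaire of the study:

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Varimax rotation of a loading matrix (standard Kaiser algorithm)."""
    L = loadings.copy()
    p, k = L.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lr**3 - (1.0 / p) * Lr @ np.diag((Lr**2).sum(axis=0)))
        )
        R = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):  # stop when the criterion stops improving
            break
        d = d_new
    return L @ R

def pca_loadings(data, n_factors):
    """Principal-component extraction: loadings from the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_factors]  # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(eigvals[order])

# Simulated data: 6 items driven by 2 latent factors (NOT the study's data).
rng = np.random.default_rng(2)
f = rng.normal(size=(200, 2))
data = np.column_stack(
    [f[:, 0] + rng.normal(0, 0.4, 200) for _ in range(3)]
    + [f[:, 1] + rng.normal(0, 0.4, 200) for _ in range(3)]
)

rotated = varimax(pca_loadings(data, 2))
# Flag loadings above the .50 cut-off used for convergent validity.
significant = np.abs(rotated) > 0.50
print(significant.sum(axis=1))  # each item should load on exactly one factor
```

After rotation, a clean solution shows each item loading above .50 on exactly one factor and below it on the others, which is the pattern the rotated component matrix in the appendix is reported to exhibit. (The study itself used SPSS for this step; the sketch only mirrors the procedure.)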


Factor analysis

“To give statistical meaning to the analysis, factor analysis was undertaken before testing the hypotheses in order to reduce the number of variables, to detect structure in the relationships between variables, and to discover the underlying constructs that explain the variance (Cooper & Schindler, 2008). Factor analysis was also performed to confirm the validity and reliability of the constructs of the questionnaire. Based on the results of the factor analysis, only items loading significantly on the factors were used in the inferential analysis to test the hypotheses of the study.”



Conclusion for validity

“As has been highlighted by many authors (e.g. Anastasi & Urbina, 1997), validity is not a stable property of a measure but should be established in relation to the particular use the measure is intended for. For example, an instrument may be valid for use as a diagnostic tool for healthy elderly people, but be inappropriate for use as a measure of outcome for neurologically impaired individuals.”



Conclusion for validity and reliability

As a result, checking the validity and reliability of the instruments must rely, at least in part, on the triangulation of data sets, in order to compensate for the weaknesses of one instrument through the strengths of another.




Leedy (1997) defines triangulation as the way in which different methods of data collection, varying data sources, and different analyses or theories can be used to check the accuracy and validity of findings. Creswell (1994) argues that the use of varying methods of data collection and analysis should lead to greater validity and reliability than a single method. Therefore, both qualitative and quantitative methods were used for the purpose of triangulation. The researcher is of the opinion that deploying both qualitative and quantitative methods of data collection and analysis can enhance the credibility of the findings and of their interpretation, as the evidence and themes emerge from different sources.





Data Analysis overview:


The quantitative data analysis aimed to achieve the first and second objectives of the present research study. Quantitative data were analysed using a descriptive analysis process, followed by an inferential analysis. The software used to analyse the data was the Statistical Package for the Social Sciences (SPSS), version 16.0. Descriptive statistics were used to gain a broad appreciation of the data collected. Factor analysis was then performed to confirm the validity and reliability of the constructs of the questionnaire related to the external factors. Based on the results of the factor analysis, only items loading significantly on the factors were used in the inferential analysis to test the hypotheses of the study.

Following the quantitative phase, a qualitative approach was adopted. More specifically, based on Geertz’s (1973) concept of thick description, an ethnographic approach was embraced in order to explain not just the behaviour but also its context. To achieve the purpose of thick description, fifteen in-depth face-to-face semi-structured interviews were conducted with owner-managers of SMEs in Tangier, selected through judgmental sampling.

The qualitative data analysis aimed to achieve the third objective of the study. It was performed using thematic analysis that followed closely the six phases described by Braun & Clarke (2006). The software QSR NVivo 9 was used to facilitate the analysis. While the organizational and automation facilities offered by QSR NVivo 9 eased the repetitive tasks, the analysis itself remained the researcher’s work.

